(57) Abstract: The present invention relates to carotenoid
overproducing bacteria. The genes of the isoprenoid
pathway in the bacterial hosts of the invention have been
engineered such that certain genes are either up-regulated
or down-regulated, resulting in the production of carotenoid
compounds at a higher level than is found in the unmodified
host. Genes that may be up-regulated include the dxs, idi,
ispB, lytB and ygbBP genes. Additionally, it has been found
that a partial disruption of the yjeR gene has the effect of
enhancing carotenoid production.

TITLE

INCREASING CAROTENOID PRODUCTION IN BACTERIA VIA
CHROMOSOMAL INTEGRATION

This application claims the benefit of U.S. Provisional Application
No. 60/434,618 filed December 19, 2002.

FIELD OF THE INVENTION 
This invention is in the field of microbiology. More specifically, this 
invention pertains to carotenoid overproducing bacterial strains. 

BACKGROUND OF THE INVENTION
Carotenoids are pigments that are ubiquitous throughout nature
and are synthesized by all oxygen evolving photosynthetic organisms and by
some heterotrophically growing bacteria and fungi. Industrial uses of
carotenoids include pharmaceuticals, food supplements, electro-optic
applications, animal feed additives, and colorants in cosmetics, to mention
a few. Because animals are unable to synthesize carotenoids de novo,
they must obtain them by dietary means. Thus, manipulation of
carotenoid production and composition in plants or bacteria can provide
new or improved sources of carotenoids.

Carotenoids come in many different forms and chemical structures.
Most naturally occurring carotenoids are hydrophobic tetraterpenoids
containing a C40 methyl-branched hydrocarbon backbone derived from
successive condensation of eight C5 isoprene units (isopentenyl
pyrophosphate, IPP). In addition, novel carotenoids with longer or shorter
backbones occur in some species of nonphotosynthetic bacteria.

The genetics of carotenoid pigment biosynthesis are well-known
(Armstrong et al., J. Bact., 176:4795-4802 (1994); Armstrong et al., Annu.
Rev. Microbiol., 51:629-659 (1997)). This pathway is extremely
well-studied in the Gram-negative, pigmented bacteria of the genus Pantoea,
formerly known as Erwinia. In both E. herbicola EHO-10 (ATCC 39368)
and E. uredovora 20D3 (ATCC 19321), the crt genes are clustered in two
operons, crtZ and crtEXYIB (US 5,656,472; US 5,545,816;
US 5,530,189; US 5,530,188; and US 5,429,939).

Isoprenoids constitute the largest class of natural products in
nature, and serve as precursors for sterols (eukaryotic membrane
stabilizers), gibberellins and abscisic acid (plant hormones),
menaquinone, plastoquinones, and ubiquinone (used as carriers for
electron transport), tetrapyrroles as well as carotenoids and the phytol
side chain of chlorophyll (pigments for photosynthesis). All isoprenoids
are synthesized via a common metabolic precursor, isopentenyl
pyrophosphate (IPP). Until recently, the biosynthesis of IPP was generally
assumed to proceed exclusively from acetyl-CoA via the classical
mevalonate pathway. However, the existence of an alternative,
mevalonate-independent pathway for IPP formation has been
characterized in eubacteria and green algae.

E. coli contains genes that encode enzymes of the mevalonate- 
independent pathway of isoprenoid biosynthesis (Figure 1). In this 
pathway, isoprenoid biosynthesis starts with the condensation of pyruvate 
with glyceraldehyde-3-phosphate (G3P) to form deoxy-D-xylulose via the 
enzyme encoded by the dxs gene. A host of additional enzymes are then 
used in subsequent sequential reactions, converting deoxy-D-xylulose to 
the final C5 isoprene product, isopentenyl pyrophosphate (IPP). IPP is 
converted to the isomer dimethylallyl pyrophosphate (DMAPP) via the 
enzyme encoded by the idi gene. IPP is condensed with DMAPP to form 
C10 geranyl pyrophosphate (GPP) which is then elongated to C15 
farnesyl pyrophosphate (FPP). 
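
The chain-length bookkeeping in this paragraph can be sketched as simple carbon counting. The following Python fragment is only an illustrative sketch: the function and variable names are hypothetical, and the C20 (GGPP) and C40 (phytoene) steps are included on the assumption that elongation continues by the same C5 additions, consistent with the C40 backbone built from eight C5 units described earlier.

```python
# Carbon counting for the prenyl pyrophosphate elongation steps described above.
# All names here are illustrative assumptions, not terms from the application.
C5 = 5  # IPP and its isomer DMAPP are both five-carbon units


def condense(chain_carbons, ipp_units=1):
    """Add one or more C5 IPP units to a growing prenyl chain."""
    return chain_carbons + ipp_units * C5


dmapp = C5
gpp = condense(dmapp)   # IPP + DMAPP -> geranyl pyrophosphate
fpp = condense(gpp)     # GPP + IPP -> farnesyl pyrophosphate
ggpp = condense(fpp)    # FPP + IPP -> geranylgeranyl pyrophosphate
phytoene = 2 * ggpp     # two GGPP molecules joined to give the carotenoid backbone

chain_lengths = {"GPP": gpp, "FPP": fpp, "GGPP": ggpp, "phytoene": phytoene}
print(chain_lengths)
```

The final count equals eight C5 units, which matches the C40 backbone stated for most naturally occurring carotenoids.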

FPP synthesis is common in both carotenogenic and non-carotenogenic
bacteria. E. coli does not normally contain the genes
necessary for conversion of FPP to β-carotene (Figure 1). Enzymes in the
subsequent carotenoid pathway generate carotenoid pigments from the
FPP precursor and can be divided into two categories: carotene backbone
synthesis enzymes and subsequent modification enzymes. The backbone
synthesis enzymes include geranylgeranyl pyrophosphate synthase
(CrtE), phytoene synthase (CrtB), phytoene dehydrogenase (CrtI) and
lycopene cyclase (CrtY/L), etc. The modification enzymes include
ketolases, hydroxylases, dehydratases, glycosylases, etc.

E. coli is a convenient host for heterologous carotenoid production.
Most of the carotenogenic genes from bacteria, fungi and higher plants
can be functionally expressed in E. coli (Sandmann, G., Trends in Plant
Science, 6:14-17 (2001)). Furthermore, many genetic tools are available
for use in E. coli, a production host often used for large-scale
bioprocesses.

Engineering E. coli for increased carotenoid production has
previously focused on overexpression of key isoprenoid pathway genes
from multi-copy plasmids. It has been postulated that the total amount of
carotenoids produced in non-carotenogenic hosts is limited by the
availability of terpenoid precursors (Albrecht et al., Biotechnol. Lett.,
21:791-795 (1999)). Several studies have reported between a 1.5X and
50X increase in carotenoid formation in such E. coli systems upon cloning
and transformation of plasmids encoding isopentenyl diphosphate
isomerase (idi), deoxy-D-xylulose-5-phosphate (DXP) synthase (dxs), and DXP
reductoisomerase (dxr) from various sources (Kim, S., and Keasling, J.,
Biotech. Bioeng., 72:408-415 (2001); Mathews, P., and Wurtzel, E., Appl.
Microbiol. Biotechnol., 53:396-400 (2000); Harker, M., and Bramley, P.,
FEBS Lett., 448:115-119 (1999); Misawa, N., and Shimada, H., J.
Biotechnol., 59:169-181 (1998); Liao et al., Biotechnol. Bioeng., 62:235-241
(1999); and Misawa et al., Biochem. J., 324:421-426 (1997)). In
addition, it has also been reported that increasing isoprenoid precursor
concentration may be lethal (Sandmann, G., supra).

The highest level of carotenoids produced to date in E. coli is
around 1.57 mg/g dry cell weight (DCW). In contrast, engineered strains
of Candida utilis produce 7.8 mg of lycopene per gram of dry cell weight
(Sandmann, supra). It has been speculated that the limits for
carotenoid production in a non-carotenogenic host, such as E. coli, had
been reached at the level of around 1.5 mg/g DCW due to carotenoid
overload of the membranes, disrupting membrane functionality. Because
of this, it has been suggested that the future focus of engineering E. coli
for high levels of carotenoid production should be on formation of
additional membranes (Albrecht et al., supra).

Most of the work to date in the metabolic engineering of
isoprenoids has been done using carotenoids primarily because of the
easy color screening. Engineering an increased supply of isoprenoid
precursors for increased production of carotenoids is necessary. It has
been shown that a rate-limiting step in carotenoid biosynthesis is the
isomerization of IPP to DMAPP (Kajiwara et al., Biochem. J., 423: 421-426
(1997)). It was also found that the conversion from FPP to GGPP is the
first functional limiting step for the production of carotenoids in E. coli
(Wang et al., Biotechnol. Prog., 62: 235-241 (1999)). Transformation of
E. coli for overexpression of the dxs, dxr, and idi genes was found to
increase production of carotenoids by a factor of 3.5 (Albrecht et al.,
supra). To avoid competition from other pathways and to relieve the
limiting steps, a GGPP synthase (gps) from Archaeoglobus fulgidus was
cloned in a multi-copy expression vector and over-expressed in E. coli,
along with the E. coli idi gene (Wang et al., supra). These examples show
that multi-copy expression vectors have been widely used in
metabolic engineering for the production of carotenoids.

The problem to be solved, therefore, is to engineer and provide
microbial hosts which are capable of producing increased levels of
carotenoids. Applicants have solved the stated problem by making
modifications to the E. coli chromosome, increasing β-carotene production
up to 6 mg per gram dry cell weight (6000 PPM), an increase of 30-fold
over initial levels, with no lethal effect.
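
The figures quoted in this paragraph can be cross-checked with a short calculation. The sketch below is illustrative only: the helper names are hypothetical, and the starting level it prints is merely what the stated 30-fold figure would imply if both values are expressed in mg per gram dry cell weight; the application does not state that baseline here.

```python
# Cross-check of the stated titer: 6 mg per g dry cell weight, 6000 PPM, 30-fold increase.
MG_PER_G_TO_PPM = 1000  # 1 mg per g corresponds to 1000 parts per million by mass


def to_ppm(mg_per_g):
    """Convert a mass fraction in mg/g to parts per million."""
    return mg_per_g * MG_PER_G_TO_PPM


def implied_baseline(final_mg_per_g, fold_increase):
    """Starting level implied by a final level and a fold increase (an inference)."""
    return final_mg_per_g / fold_increase


final_titer = 6.0   # mg beta-carotene per g dry cell weight, as stated
fold = 30           # fold increase over initial levels, as stated

print(to_ppm(final_titer))                  # mass fraction in PPM
print(implied_baseline(final_titer, fold))  # implied initial level in mg/g
```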

SUMMARY OF THE INVENTION 
The invention provides a carotenoid overproducing bacteria
comprising the genes encoding a functional carotenoid enzymatic
biosynthetic pathway wherein the dxs, idi and ygbBP genes are
overexpressed and wherein the yjeR gene is down-regulated.

Additionally, the invention provides a carotenoid overproducing
bacteria comprising the genes encoding a functional carotenoid enzymatic
biosynthetic pathway wherein the dxs, idi, ygbBP and ispB genes are
overexpressed. Optionally, the lytB gene may also be overexpressed to
further enhance the carotenoid production.

In a preferred embodiment, the invention provides a carotenoid
overproducing bacteria selected from the group consisting of a strain
having the ATCC identification number PTA-4807 and a strain having the
ATCC identification number PTA-4823.

In another embodiment the invention provides a method for the
production of a carotenoid comprising:
a) growing the carotenoid overproducing bacteria of the invention,
the bacteria overexpressing at least one gene selected from the
group consisting of dxs, idi, ygbBP, ispB, lytB, and dxr, wherein yjeR
is optionally downregulated, for a time sufficient to produce a
carotenoid; and

b) optionally recovering the carotenoid from the carotenoid
overproducing bacteria of step (a).

BRIEF DESCRIPTION OF THE DRAWINGS 
AND SEQUENCE DESCRIPTIONS 
Figure 1 outlines the isoprenoid and carotenoid biosynthetic
pathways used for production of β-carotene in E. coli.

Figure 2 shows the strategy for chromosomal integration of
promoter or full gene sequences and stacking the strong promoter-
isoprenoid gene fusions.

Figure 3 shows PCR analysis of chromosomal insertions.

Figure 4 shows PCR analysis of chromosomal insertions.

Figure 5 shows PCR analysis of chromosomal insertions.

Figure 6 shows the plasmid map of pSUH5.

Figure 7 shows the plasmid map of pPCB15.

Figure 8 shows the strategy for creating E. coli Tn5 mutants which
have increased carotenoid production.

Figure 9 shows increased β-carotene production from an E. coli
Tn5 mutant.

Figure 10 shows insertion site of Tn5 in the Y15; yjeR::Tn5
mutation.

Figure 11 shows β-carotene production by the engineered E. coli
strains of the present invention.

Figure 12 shows bacteriophage P1 mediated transduction and
parallel combinatorial stacking used in the optimization of β-carotene
production.

The invention can be more fully understood from the following 
detailed description and the accompanying sequence descriptions, which 
form a part of this application. 

The following sequences comply with 37 C.F.R. 1.821-1.825
("Requirements for Patent Applications Containing Nucleotide Sequences
and/or Amino Acid Sequence Disclosures - the Sequence Rules") and are
consistent with World Intellectual Property Organization (WIPO) Standard
ST.25 (1998) and the sequence listing requirements of the EPO and PCT
(Rules 5.2 and 49.5(a-bis), and Section 208 and Annex C of the
Administrative Instructions). The symbols and format used for nucleotide
and amino acid sequence data comply with the rules set forth in
37 C.F.R. §1.822.



Gene/Protein     Source               Nucleotide    Amino Acid
Product                               SEQ ID NO     SEQ ID NO

CrtE             Pantoea stewartii    1             2
CrtX             Pantoea stewartii    3             4
CrtY             Pantoea stewartii    5             6
CrtI             Pantoea stewartii    7             8
CrtB             Pantoea stewartii    9             10
CrtZ             Pantoea stewartii    11            12
dxs(16a)         Methylomonas 16a     13            14
lytB(16a)        Methylomonas 16a     15            16
dxr(16a)         Methylomonas 16a     17            18



SEQ ID NOs:19-20 are oligonucleotide primers used to amplify the
carotenoid biosynthesis genes from P. stewartii.

SEQ ID NOs:21-32 are oligonucleotide primers used to create
chromosomal integration of the T5 strong promoter (PT5) upstream from
E. coli isoprenoid genes in the present invention.

SEQ ID NO:33 is the nucleotide sequence of the PT5 promoter
sequence inserted in pKD4 to create pSUH5.

SEQ ID NOs:34-45 are oligonucleotide primers for creating
dxs(16a), dxr(16a), and lytB(16a) gene insertions in the E. coli
chromosome.

SEQ ID NOs:46-62 are oligonucleotide primers used for screening to
confirm correct insertion of chromosomal integrations in the present
invention.

SEQ ID NO:63 is the nucleotide sequence of the yjeR::Tn5 mutant
gene.

SEQ ID NO:64 is the nucleotide sequence for plasmid pPCB15.

SEQ ID NO:65 is the nucleotide sequence for plasmid pKD46.

SEQ ID NO:66 is the nucleotide sequence for plasmid pSUH5.

BRIEF DESCRIPTION OF BIOLOGICAL DEPOSITS

The following biological deposits have been made under the terms
of the Budapest Treaty on the International Recognition of the Deposit of
Microorganisms for the purposes of Patent Procedure:



Depositor Identification                 Int'l Depository
Reference                                Designation        Date of Deposit

Plasmid pCP20                            ATCC# PTA-4455     June 13, 2002
Methylomonas 16a                         ATCC# PTA-2402     August 22, 2000
WS#124 E. coli strain PT5-dxs PT5-idi    ATCC# PTA-4807     November 20, 2002
  PT5-ygbBP yjeR::Tn5, pPCB15
WS#208 E. coli strain PT5-dxs PT5-idi    ATCC# PTA-4823     November 26, 2002
  PT5-ygbBP PT5-ispB, pDCQ108



As used herein, "ATCC" refers to the American Type Culture
Collection International Depository Authority located at ATCC, 10801
University Blvd., Manassas, VA 20110-2209, USA. The "International
Depository Designation" is the accession number to the culture on deposit
with ATCC.

The listed deposits will be maintained in the indicated international
depository for at least thirty (30) years and will be made available to the
public upon the grant of a patent disclosing it. The availability of a deposit
does not constitute a license to practice the subject invention in
derogation of patent rights granted by government action.

DETAILED DESCRIPTION OF THE INVENTION
In this disclosure, a number of terms and abbreviations are used.
The following definitions are provided.

"Open reading frame" is abbreviated ORF.
"Polymerase chain reaction" is abbreviated PCR.
As used herein, an "isolated nucleic acid fragment" is a polymer of
RNA or DNA that is single- or double-stranded, optionally containing
synthetic, non-natural or altered nucleotide bases. An isolated nucleic
acid fragment in the form of a polymer of DNA may be comprised of one
or more segments of cDNA, genomic DNA or synthetic DNA.

The term "isoprenoid" or "terpenoid" refers to the compounds and
any molecules derived from the isoprenoid pathway including 10 carbon
terpenoids and their derivatives, such as carotenoids and xanthophylls.

A "carotene" refers to a hydrocarbon carotenoid. Carotene
derivatives that contain one or more oxygen atoms, in the form of
hydroxy-, methoxy-, oxo-, epoxy-, carboxy-, or aldehydic functional groups,
or within glycosides, glycoside esters, or sulfates, are collectively known
as "xanthophylls". Carotenoids are furthermore described as being
acyclic, monocyclic, or bicyclic depending on whether the ends of the
hydrocarbon backbones have been cyclized to yield aliphatic or cyclic ring
structures (G. Armstrong, (1999) In Comprehensive Natural Products
Chemistry, Elsevier Press, volume 2, pp 321-352).

The terms "λ-Red recombination system", "λ-Red system" and
"λ-Red recombinase" are used interchangeably to describe a group of
enzymes encoded by the bacteriophage λ genes exo, bet, and gam. The
enzymes encoded by the three genes work together to increase the rate of
homologous recombination in E. coli, an organism generally considered to
have a relatively low rate of homologous recombination, especially when
using linear integration cassettes. The λ-Red system facilitates the ability
to use short regions of homology (10-50 bp) flanking linear double-
stranded (ds) DNA fragments for homologous recombination. In the
present method, the λ-Red genes are expressed on helper plasmid
pKD46 (Datsenko and Wanner, PNAS, 97:6640-6645 (2000); SEQ ID
NO:65).

The terms "Methylomonas 16a strain" and "Methylomonas 16a" are
used interchangeably and refer to a bacterium (ATCC PTA-2402) of a
physiological group of bacteria known as methylotrophs, which are unique
in their ability to utilize methane as a sole carbon and energy source.

The term "yjeR" refers to the oligo-ribonuclease gene locus.

The term "Dxs" refers to the enzyme D-1-deoxyxylulose 5-phosphate
synthase encoded by the dxs gene which catalyzes the condensation of
pyruvate and D-glyceraldehyde 3-phosphate to D-1-deoxyxylulose 5-
phosphate (DOXP).

The terms "Dxr" or "IspC" refer to the enzyme DOXP
reductoisomerase encoded by the dxr or ispC gene that catalyzes the
simultaneous reduction and isomerization of DOXP to 2-C-methyl-D-
erythritol-4-phosphate. The names of the gene, dxr or ispC, are used
interchangeably in this application. The names of gene product, Dxr or
IspC, are used interchangeably in this application.

The term "YgbP" or "IspD" refers to the enzyme encoded by
the ygbP or ispD gene that catalyzes the CTP-dependent cytidylation of 2-
C-methyl-D-erythritol-4-phosphate to 4-diphosphocytidyl-2C-methyl-D-
erythritol. The names of the gene, ygbP or ispD, are used interchangeably
in this application. The names of gene product, YgbP or IspD, are used
interchangeably in this application.

The term "YchB" or "IspE" refers to the enzyme encoded by the
ychB or ispE gene that catalyzes the ATP-dependent phosphorylation of
4-diphosphocytidyl-2C-methyl-D-erythritol to 4-diphosphocytidyl-2C-
methyl-D-erythritol-2-phosphate. The names of the gene, ychB or ispE,
are used interchangeably in this application. The names of gene product,
YchB or IspE, are used interchangeably in this application.

The term "YgbB" or "IspF" refers to the enzyme encoded by the
ygbB or ispF gene that catalyzes the cyclization, with loss of CMP, of
4-diphosphocytidyl-2C-methyl-D-erythritol-2-phosphate to
2C-methyl-D-erythritol-2,4-cyclodiphosphate.
The names of the gene, ygbB or ispF, are used interchangeably in this
application. The names of gene product, YgbB or IspF, are used
interchangeably in this application.

The term "GcpE" or "IspG" refers to the enzyme encoded by the
gcpE or ispG gene that is involved in conversion of 2C-methyl-D-erythritol-
2,4-cyclodiphosphate to 1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate.
The names of the gene, gcpE or ispG, are used interchangeably in this
application. The names of gene product, GcpE or IspG, are used
interchangeably in this application.

The term "LytB" or "IspH" refers to the enzyme encoded by the lytB
or ispH gene and is involved in conversion of 1-hydroxy-2-methyl-2-(E)-
butenyl 4-diphosphate to isopentenyl diphosphate (IPP) and dimethylallyl
diphosphate (DMAPP). The names of the gene, lytB or ispH, are used
interchangeably in this application. The names of gene product, LytB or
IspH, are used interchangeably in this application.

The term "Idi" refers to the enzyme isopentenyl diphosphate
isomerase encoded by the idi gene that converts isopentenyl diphosphate
to dimethylallyl diphosphate.

The term "IspA" refers to the enzyme farnesyl pyrophosphate (FPP) 
synthase encoded by the ispA gene. 

The term "IspB" refers to the enzyme octaprenyl diphosphate
synthase, encoded by the ispB gene, which supplies the precursor of the
side chain of the isoprenoid quinones.

The term "pPCB15" refers to the plasmid (Figure 7; SEQ ID NO:64)
containing β-carotene synthesis genes Pantoea crtEXYIB, used as a
reporter plasmid for monitoring β-carotene production in E. coli genetically
engineered via the present method.

The term "pKD46" refers to the plasmid (SEQ ID NO:65; Datsenko
and Wanner, supra) having GenBank® Accession number AY048746.
Plasmid pKD46 expresses the components of the λ-Red Recombinase
system.

The term "pSUH5" refers to the plasmid (Figure 6; SEQ ID NO:66)
that was constructed by cloning a phage T5 promoter (PT5) region into the
NdeI restriction endonuclease site of pKD4 (Datsenko and Wanner,
supra). It was used as a template plasmid for PCR amplification of a
fused kanamycin selectable marker/phage T5 promoter linear DNA
nucleotide.

The term "triple homologous recombination" in the present
invention refers to a genetic recombination between two linear (PCR-
generated) DNA fragments and the target chromosome via their
homologous sequences resulting in chromosomal integration of the two
linear nucleic acid fragments into the target chromosome.

The term "homology arm" refers to a nucleotide sequence which
enables homologous recombination between two nucleic acids having
substantially the same nucleotide sequence in a particular region of two
different nucleic acids. The preferred size range of the nucleotide
sequence of the homology arm is from about 10 to about 100 nucleotides.
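
As a rough illustration of how homology arms direct a linear fragment to a chromosomal location, the following Python sketch treats DNA as plain strings. It is a toy model under stated assumptions, not the applicants' procedure: recombination is an enzymatic process rather than string replacement, the function names are hypothetical, and the sequences are invented for the example. Only the 10 to 100 nucleotide size range is taken from the definition above.

```python
# Toy string model of arm-directed integration; all sequences are made up.
MIN_ARM_NT = 10    # lower bound of the preferred homology arm size stated above
MAX_ARM_NT = 100   # upper bound of the preferred homology arm size stated above


def check_arm(arm, label):
    """Reject arms outside the preferred size range."""
    if not MIN_ARM_NT <= len(arm) <= MAX_ARM_NT:
        raise ValueError(
            f"{label} homology arm is {len(arm)} nt; "
            f"expected {MIN_ARM_NT}-{MAX_ARM_NT} nt"
        )


def integrate(chromosome, upstream_arm, payload, downstream_arm):
    """Replace whatever lies between the two arm matches with the payload."""
    check_arm(upstream_arm, "upstream")
    check_arm(downstream_arm, "downstream")
    start = chromosome.find(upstream_arm)
    if start == -1:
        raise ValueError("upstream homology arm not found in target")
    after_upstream = start + len(upstream_arm)
    end = chromosome.find(downstream_arm, after_upstream)
    if end == -1:
        raise ValueError("downstream homology arm not found in target")
    return chromosome[:after_upstream] + payload + chromosome[end:]


up = "ATGCATGCATGC"      # hypothetical 12-nt upstream arm
down = "TTGACCGGTTAA"    # hypothetical 12-nt downstream arm
chromosome = "CCCC" + up + "AAAAAA" + down + "GGGG"
edited = integrate(chromosome, up, "TCTAGA", down)
print(edited)
```

In this model the payload stands in for any expressible DNA fragment carried between the arms; a real design would also have to consider orientation and unique matching of each arm.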

The term "site-specific recombinase" is used in the present
invention to describe a system comprised of one or more enzymes which
recognize specific nucleotide sequences (recombination target sites) and
which catalyze recombination between the recombination target sites.
Site-specific recombination provides a method to rearrange, delete, or
introduce exogenous DNA. Examples of site-specific recombinases and
their associated recombination target sites are: Cre-lox, FLP/FRT, R/RS,
Gin/gix, Xer/dif, Int/att, a pSR1 system, a cer system, and a fim system.
The present invention illustrates the use of a site-specific recombinase to
remove selectable markers. Antibiotic resistance markers, flanked on
both sides by FRT recombination target sites, are removed by expression
of the FLP site-specific recombinase.
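
The marker-removal step can be pictured with a small feature-list model. The Python sketch below is an assumption-laden illustration: the function name, the feature labels, and the order of features in the example locus are hypothetical, and the model only captures the general outcome of FLP acting on two directly repeated FRT sites, namely that the intervening segment is excised and a single FRT site remains.

```python
# Toy model of FLP-mediated excision of a marker flanked by FRT sites.
# Feature names and their order are hypothetical, chosen only for illustration.
def flp_excise(features):
    """Remove the segment between the first two FRT sites, leaving one FRT."""
    try:
        first = features.index("FRT")
        second = features.index("FRT", first + 1)
    except ValueError:
        # Fewer than two FRT sites: nothing to excise.
        return list(features)
    return features[: first + 1] + features[second + 1:]


locus = ["upstream", "FRT", "kanR", "FRT", "PT5", "dxs"]
print(flp_excise(locus))
```

Because the marker is gone after excision, the same selectable marker can in principle be reused for a later integration, which is what makes repeated rounds of modification practical.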

The terms "stacking", "combinatorial stacking", "chromosomal
stacking", and "trait stacking" are used interchangeably and refer to the
repeated process of stacking multiple genetic traits into one E. coli host
using bacteriophage P1 transduction in combination with the site-specific
recombinase system for removal of selection markers (Figure 12).

The term "parallel combinatorial fashion" refers to the P1
transduction with the P1 lysate mixture made from various donor cells, so
that multiple genetic traits can move into the recipient cell in parallel.

The terms "integration cassette" and "recombination element" refer
to a linear nucleic acid construct useful for the transformation of a
recombination proficient bacterial host. Recombination elements of the
invention may include a variety of genetic elements such as selectable
markers, expressible DNA fragments, and recombination regions having
homology to regions on a bacterial chromosome or on other
recombination elements. Expressible DNA fragments can include
promoters, coding sequences, genes, and other regulatory elements
specifically engineered into the recombination element to impart a desired
phenotypic change upon recombination.

The term "expressible DNA fragment" means any DNA that
influences phenotypic changes in the host cell. An "expressible DNA
fragment" may include, for example, DNA comprising regulatory elements,
isolated promoters, open reading frames, coding sequences, genes, or
combinations thereof.

The term "pDCQ108" refers to the plasmid containing β-carotene
synthesis genes Pantoea crtEXYIB used as a reporter plasmid for
monitoring β-carotene production in E. coli strains that were genetically
engineered via the present method (ATCC PTA-4823).

The terms "PT5 promoter" and "phage T5 promoter" are used
interchangeably and refer to the nucleotide sequence that comprises the
-10 and -35 consensus sequences, lactose operator (lacO), and
ribosomal binding site (rbs) from phage T5 (SEQ ID NO:33).

The term "helper plasmid" refers to either pKD46 encoding λ-Red
recombinase or pCP20 encoding FLP site-specific recombinase (ATCC
PTA-4455; Datsenko and Wanner, supra; and Cherepanov and
Wackernagel, Gene, 158:9-14 (1995)).

The term "carotenoid overproducing bacteria" refers to bacteria of
the invention which have been genetically modified by the up-regulation or
down-regulation of various genes to produce a carotenoid compound at
levels greater than the wild-type or unmodified host.

The term "E. coli" refers to Escherichia coli strain K-12 derivatives,
such as MG1655 (ATCC 47076) and MC1061 (ATCC 53338).

The term "Pantoea stewartii subsp. stewartii" is abbreviated as
"Pantoea stewartii" and is used interchangeably with Erwinia stewartii
(Mergaert et al., Int. J. Syst. Bacteriol., 43:162-173 (1993)).

The term "Pantoea ananatas" is used interchangeably with Erwinia 
uredovora (Mergaert et al., supra). 

The term "Pantoea crtEXYIB cluster" refers to a gene cluster
containing carotenoid synthesis genes crtEXYIB amplified from Pantoea
stewartii ATCC 8199. The gene cluster contains the genes crtE, crtX,
crtY, crtI, and crtB. The cluster also contains a crtZ gene organized in
opposite orientation and adjacent to the crtB gene.

The term "CrtE" refers to geranylgeranyl pyrophosphate synthase
enzyme encoded by crtE gene which converts trans-trans-farnesyl
diphosphate + isopentenyl diphosphate to pyrophosphate +
geranylgeranyl diphosphate.

The term "CrtY" refers to lycopene cyclase enzyme encoded by
crtY gene which converts lycopene to β-carotene.

The term "CrtI" refers to phytoene dehydrogenase enzyme encoded
by crtI gene which converts phytoene into lycopene via the intermediaries
of phytofluene, zeta-carotene and neurosporene by the introduction of 4
double bonds.

The term "CrtB" refers to phytoene synthase enzyme encoded by
crtB gene which catalyzes the reaction from prephytoene diphosphate
(geranylgeranyl pyrophosphate) to phytoene.

The term "CrtX" refers to zeaxanthin glucosyl transferase enzyme
encoded by crtX gene which converts zeaxanthin to zeaxanthin-β-
diglucoside.

The term "CrtZ" refers to the β-carotene hydroxylase enzyme
encoded by crtZ gene which catalyzes the hydroxylation reaction from
β-carotene to zeaxanthin.

The term "carotenoid biosynthetic pathway" refers to those genes
comprising members of the upper and/or lower isoprenoid pathways of the
present invention as illustrated in Figure 1. In the present invention, the
terms "upper isoprenoid pathway" and "upper pathway" will be used
interchangeably and will refer to the enzymes involved in converting pyruvate
and glyceraldehyde-3-phosphate to farnesyl pyrophosphate (FPP). These
enzymes include, but are not limited to, Dxs, Dxr (IspC), YgbP (IspD),
YchB (IspE), YgbB (IspF), GcpE (IspG), LytB (IspH), Idi, IspA, and
optionally IspB. In the present invention, the terms "lower carotenoid
pathway" and "lower pathway" will be used interchangeably and refer to
those enzymes which convert FPP to carotenoids, especially β-carotene
(Figure 1). The enzymes in this pathway include, but are not limited to,
CrtE, CrtY, CrtI, CrtB, CrtX, and CrtZ. In the present invention, the "lower
pathway" genes are expressed on reporter plasmids pPCB15 or
pDCQ108.
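
The naming conventions defined above (paired enzyme names and the split into upper and lower pathways) can be collected into a small lookup table. The Python sketch below is illustrative only; the dictionary layout and the `classify` helper are hypothetical conveniences, while the enzyme names and their alternative names are taken from the definitions in this section.

```python
# Lookup table for the enzyme names defined in this section.
# The data structure and helper name are illustrative assumptions.
UPPER_PATHWAY = {
    "Dxs": None,
    "Dxr": "IspC",
    "YgbP": "IspD",
    "YchB": "IspE",
    "YgbB": "IspF",
    "GcpE": "IspG",
    "LytB": "IspH",
    "Idi": None,
    "IspA": None,
    "IspB": None,   # listed in the text as optional
}
LOWER_PATHWAY = ("CrtE", "CrtY", "CrtI", "CrtB", "CrtX", "CrtZ")


def classify(enzyme):
    """Return (pathway, primary name) for an enzyme name or its alternative name."""
    for primary, alias in UPPER_PATHWAY.items():
        if enzyme == primary or (alias is not None and enzyme == alias):
            return ("upper", primary)
    if enzyme in LOWER_PATHWAY:
        return ("lower", enzyme)
    return ("unknown", enzyme)


for name in ("IspH", "Dxs", "CrtZ", "LacZ"):
    print(name, classify(name))
```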

The term "carotenoid biosynthetic enzyme" is an inclusive term
referring to any and all of the enzymes encoded by the Pantoea crtEXYIB
cluster. The enzymes include CrtE, CrtY, CrtI, CrtB, and CrtX.

The terms "P1 donor cell" and "donor cell" are used
interchangeably in the present invention and refer to a bacterial strain
susceptible to infection by a bacteriophage or virus, and which serves as a
source for the nucleic acid fragments packaged into the transducing
particles. Typically the genetic makeup of the donor cell is similar or
identical to the "recipient cell" which serves to receive P1 lysate containing
transducing particles or virus produced by the donor cell.

WO 2004/056975    PCT/US2003/041812

The terms "P1 recipient cell" and "recipient cell" are used
interchangeably in the present invention and refer to a bacterial strain
susceptible to infection by a bacteriophage or virus and which serves to
receive lysate containing transducing particles or virus produced by the
donor cell.

"Synthetic genes" can be assembled from oligonucleotide building
blocks that are chemically synthesized using procedures known to those
skilled in the art. These building blocks are ligated and annealed to form
gene segments which are then enzymatically assembled to construct the
entire gene. "Chemically synthesized", as related to a sequence of DNA,
means that the component nucleotides were assembled in vitro. Manual
chemical synthesis of DNA may be accomplished using well-established
procedures, or automated chemical synthesis can be performed using one
of a number of commercially available machines. Accordingly, the genes
can be tailored for optimal gene expression based on optimization of
nucleotide sequence to reflect the codon bias of the host cell. The skilled
artisan appreciates the likelihood of successful gene expression if codon
usage is biased towards those codons favored by the host. Determination
of preferred codons can be based on a survey of genes derived from the
host cell where sequence information is available.
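
A survey of preferred codons of the kind described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: it uses the standard genetic code, assumes each input is an in-frame DNA coding sequence, and picks the single most frequent codon per amino acid; the function name and the toy input sequence are hypothetical and are not data from the application.

```python
# Minimal codon-usage survey: most frequent codon per amino acid.
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def preferred_codons(coding_sequences):
    """Return the most frequently used codon for each amino acid observed."""
    counts = defaultdict(Counter)
    for seq in coding_sequences:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3   # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            aa = CODON_TABLE.get(codon)    # None for ambiguous bases such as N
            if aa is not None and aa != "*":
                counts[aa][codon] += 1
    return {aa: usage.most_common(1)[0][0] for aa, usage in counts.items()}


toy_genes = ["ATGCTGCTGCTAAAATAA"]   # hypothetical sequence: M L L L K stop
print(preferred_codons(toy_genes))
```

A real survey would draw on many host genes, ideally highly expressed ones, and would usually report relative frequencies rather than a single winner per amino acid.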

"Gene" refers to a nucleic acid fragment that expresses a specific
protein, including regulatory sequences preceding (5' non-coding
sequences) and following (3' non-coding sequences) the coding
sequence. "Native gene" refers to a gene as found in nature with its own
regulatory sequences. "Chimeric gene" refers to any gene that is not a
native gene, comprising regulatory and coding sequences that are not
found together in nature. Accordingly, a chimeric gene may comprise
regulatory sequences and coding sequences that are derived from
different sources, or regulatory sequences and coding sequences derived
from the same source, but arranged in a manner different than that found
in nature. "Endogenous gene" refers to a native gene in its natural
location in the genome of an organism. A "foreign" gene refers to a gene
not normally found in the host organism, but that is introduced into the
host organism by gene transfer. Foreign genes can comprise native
genes inserted into a non-native organism, or chimeric genes. A
"transgene" is a gene that has been introduced into the genome by a
transformation procedure.

The term "genetic end product" means the substance, chemical or
material (i.e. isoprenoids, carotenoids) that is produced as the result of the
activity of a gene product. Typically a gene product is an enzyme and a
genetic end product is the product of that enzymatic activity on a specific
substrate. A genetic end product may be the result of a single enzyme activity
or the result of a number of linked activities, such as found in a
biosynthetic pathway (several enzyme activities).

"Operon", in bacterial DNA, is a cluster of contiguous genes
transcribed from one promoter that gives rise to a polycistronic mRNA.

"Coding sequence" refers to a DNA sequence that codes for a 
specific amino acid sequence. "Suitable regulatory sequences" refer to 
nucleotide sequences located upstream (5' non-coding sequences),
within, or downstream (3' non-coding sequences) of a coding sequence,
and which influence the transcription, RNA processing or stability, or
translation of the associated coding sequence. Regulatory sequences
may include promoters, translation leader sequences, introns,
polyadenylation recognition sequences, RNA processing site(s), effector
binding site(s), and stem-loop structure(s).

"Promoter" refers to a DNA sequence capable of controlling the 
expression of a coding sequence or functional RNA. In general, a coding 
sequence is located 3' to a promoter sequence. Promoters may be 
derived in their entirety from a native gene, or be composed of different 
elements derived from different promoters found in nature, or even
comprise synthetic DNA segments. It is understood by those skilled in the
art that different promoters may direct the expression of a gene in different
tissues or cell types, or at different stages of development, or in response
to different environmental or physiological conditions ("inducible
promoters"). Promoters which cause a gene to be expressed in most cell
types at most times are commonly referred to as "constitutive promoters".
Promoters can be further classified by the relative strength of expression
observed by their use (i.e. weak, moderate, or strong). It is further
recognized that since in most cases the exact boundaries of regulatory
sequences have not been completely defined, DNA fragments of different
lengths may have identical promoter activity.

The "3' non-coding sequences" refer to DNA sequences located
downstream of a coding sequence and include regulatory signals capable 
of affecting mRNA processing or gene expression.

"RNA transcript" refers to the product resulting from RNA
polymerase-catalyzed transcription of a DNA sequence. When the RNA
transcript is a perfect complementary copy of the DNA sequence, it is
referred to as the primary transcript or it may be an RNA sequence derived
from post-transcriptional processing of the primary transcript and is
referred to as the mature RNA. "Messenger RNA (mRNA)" refers to the
RNA that is without introns and that can be translated into protein by the
cell. "Sense" RNA refers to RNA transcript that includes the mRNA and so
can be translated into protein by the cell. "Antisense RNA" refers to an
RNA transcript that is complementary to all or part of a target primary
transcript or mRNA and that blocks the expression of a target gene
(US 5,107,065; WO 99/28508). The complementarity of an antisense
RNA may be with any part of the specific gene transcript, i.e., at the
5' non-coding sequence, 3' non-coding sequence, or the coding sequence.
"Functional RNA" refers to antisense RNA, ribozyme RNA, or other RNA
that is not translated yet has an effect on cellular processes.
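
The complementarity that defines an antisense RNA can be shown with a short sketch. It only illustrates base pairing and is not taken from the specification; the function name `antisense_rna` is hypothetical, and the input is assumed to contain only the bases A, U, G and C.

```python
# Watson-Crick pairing for RNA bases: A pairs with U, G pairs with C.
RNA_COMPLEMENT = str.maketrans("AUGC", "UACG")


def antisense_rna(target_rna: str) -> str:
    """Return the strand, written 5'->3', that is fully complementary to
    target_rna (which is also written 5'->3')."""
    # Complement each base, then reverse so the result reads 5'->3'.
    return target_rna.upper().translate(RNA_COMPLEMENT)[::-1]
```

The target passed in may be any part of the transcript (5' non-coding, coding, or 3' non-coding sequence), consistent with the definition above.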

The term "operably linked" refers to the association of nucleic acid
sequences on a single nucleic acid fragment so that the function of one is
affected by the other. For example, a promoter is operably linked with a
coding sequence when it is capable of affecting the expression of that
coding sequence (i.e., that the coding sequence is under the
transcriptional control of the promoter). Coding sequences can be
operably linked to regulatory sequences in sense or antisense orientation.

The term "expression", as used herein, refers to the transcription
and stable accumulation of sense (mRNA) or antisense RNA derived from
the nucleic acid fragment of the invention. Expression may also refer to
translation of mRNA into a polypeptide.

"Transformation" refers to the transfer of a nucleic acid fragment 
into the genome of a host organism, resulting in genetically stable 
inheritance. Host organisms containing the transformed nucleic acid 
fragments are referred to as "transgenic", "recombinant" or "transformed"
organisms.

The terms "transduction" and "generalized transduction" are used 
interchangeably and refer to a phenomenon in which bacterial DNA is 
transferred from one bacterial cell (the donor) to another (the recipient) by
a phage particle containing bacterial DNA (Figure 12). The bacterial DNA
fragment from the donor can undergo homologous recombination with the 
recipient cell's chromosome, stably integrating the donor cell's DNA 
fragment into the recipient's chromosome. 
The terms "plasmid", "vector" and "cassette" refer to an extra
chromosomal element often carrying genes which are not part of the
central metabolism of the cell, and usually in the form of circular
double-stranded DNA fragments. Such elements may be autonomously
replicating sequences, genome integrating sequences, phage or
nucleotide sequences, linear or circular, of a single- or double-stranded
DNA or RNA, derived from any source, in which a number of nucleotide
sequences have been joined or recombined into a unique construction
which is capable of introducing a promoter fragment and DNA sequence
for a selected gene product along with appropriate 3' untranslated
sequence into a cell. "Transformation cassette" refers to a specific vector
containing a foreign gene and having elements in addition to the foreign
gene that facilitate transformation of a particular host cell. "Expression
cassette" refers to a specific vector containing a foreign gene and having
elements in addition to the foreign gene that allow for enhanced
expression of that gene in a foreign host.

The term "sequence analysis software" refers to any computer
algorithm or software program that is useful for the analysis of nucleotide
or amino acid sequences. "Sequence analysis software" may be
commercially available or independently developed. Typical sequence
analysis software will include but is not limited to the GCG suite of
programs (Wisconsin Package Version 9.0, Genetics Computer Group
(GCG), Madison, WI), BLASTP, BLASTN, BLASTX (Altschul et al., J. Mol.
Biol. 215:403-410 (1990)), DNASTAR (DNASTAR, Inc., 1228 S. Park
St., Madison, WI 53715 USA), and the FASTA program incorporating the
Smith-Waterman algorithm (W. R. Pearson, Comput. Methods Genome
Res., [Proc. Int. Symp.] (1994), Meeting Date 1992, 111-20. Editor(s):
Suhai, Sandor. Publisher: Plenum, New York, NY). Within the context of
this application it will be understood that where sequence analysis
software is used for analysis, the results of the analysis will be based
on the "default values" of the program referenced, unless otherwise
specified. As used herein "default values" will mean any set of values or
parameters which originally load with the software when first initialized.

The present invention relates to carotenoid overproducing bacteria.
The genes of the isoprenoid pathway in the bacterial hosts of the invention
have been engineered such that certain genes are either up-regulated or
down-regulated, resulting in the production of carotenoid compounds at a
higher level than is found in the unmodified host. In some instances the
genes that are regulated are directly involved in the carotenoid
biosynthetic pathway. In other instances the genes involved are
chromosomal genes that have no understood relationship to the
carotenoid biosynthetic pathway.

It has been found that over-expression of certain combinations of
carotenoid biosynthetic genes will give an unexpectedly high level of
carotenoid production. Examples of genes useful in this manner which
are part of the carotenoid biosynthetic pathway are the dxs gene
(catalyzing the condensation of pyruvate and D-glyceraldehyde
3-phosphate to D-1-deoxyxylulose 5-phosphate), the idi gene (converting
isopentenyl diphosphate to dimethylallyl diphosphate), the ygbB (ispF)
gene (catalyzing the cyclization, with loss of CMP, of
4-diphosphocytidyl-2C-methyl-D-erythritol-2-phosphate to
2C-methyl-D-erythritol-2,4-cyclodiphosphate), the ygbP (ispD) gene
(catalyzing the CTP-dependent cytidylation of
2-C-methyl-D-erythritol-4-phosphate to
4-diphosphocytidyl-2C-methyl-D-erythritol), the ygbB and ygbP genes
being together referred to as the ygbBP gene, the lytB (ispH) gene
(involved in conversion of 2C-methyl-D-erythritol-2,4-cyclodiphosphate
to dimethylallyl diphosphate and isopentenyl diphosphate), and the ispB
gene encoding the enzyme octaprenyl diphosphate synthase. When these
genes are selectively over-expressed under the control of a strong
promoter the result is an unexpectedly high level of carotenoid
production. It is important to note that it is the combination of the
over-expression of these genes that has been shown to give the desired
effect.

Alternatively, it has also been found that certain essential
chromosomal genes, when mutated, will alter the output of the carotenoid
biosynthetic pathway. One such gene is the yjeR gene (defining an
oligoribonuclease locus). It has been found that a partial mutation in this
gene will unexpectedly increase carotenoid production in a host cell
capable of carotenoid biosynthesis.

Genes Involved in Carotenoid Production.

The enzyme pathway involved in the biosynthesis of carotenoids
can be conveniently viewed in two parts, the upper isoprenoid pathway
providing for the conversion of pyruvate and glyceraldehyde-3-phosphate
to farnesyl pyrophosphate (FPP) and the lower carotenoid biosynthetic
pathway, which provides for the synthesis of phytoene and all
subsequently produced carotenoids. The upper pathway is ubiquitous in
many non-carotenogenic microorganisms and in these cases it will only be
necessary to introduce genes that comprise the lower pathway for the
biosynthesis of the desired carotenoid. The key division between the two
pathways concerns the synthesis of farnesyl pyrophosphate. Where FPP
is naturally present, only elements of the lower carotenoid pathway will be
needed. However, it will be appreciated that for the lower pathway
carotenoid genes to be effective in the production of carotenoids, it will be
necessary for the host cell to have suitable levels of FPP within the cell.
Where FPP synthesis is not provided by the host cell, it will be necessary
to introduce the genes necessary for the production of FPP. Each of
these pathways will be discussed below in detail.
The Upper Isoprenoid Pathway 

Isoprenoid biosynthesis occurs through either of two pathways,
generating the common C5 isoprene sub-unit, isopentenyl pyrophosphate
(IPP). First, IPP may be synthesized through the well-known
acetate/mevalonate pathway. However, recent studies have
demonstrated that the mevalonate-dependent pathway does not operate
in all living organisms. An alternate mevalonate-independent pathway for
IPP biosynthesis has been characterized in bacteria and in green algae
and higher plants (Horbach et al., FEMS Microbiol. Lett., 111:135-140
(1993); Rohmer et al., Biochem., 295: 517-524 (1993); Schwender et al.,
Biochem., 316: 73-80 (1996); and Eisenreich et al., Proc. Natl. Acad. Sci.
USA, 93: 6431-6436 (1996)).

Many steps in the mevalonate-independent isoprenoid pathway are
known (Figure 1). For example, the initial steps of the alternate pathway
leading to the production of IPP have been studied in Mycobacterium
tuberculosis by Cole et al. (Nature, 393:537-544 (1998)). The first step of
the pathway involves the condensation of two 3-carbon molecules
(pyruvate and D-glyceraldehyde 3-phosphate) to yield a 5-carbon
compound known as D-1-deoxyxylulose-5-phosphate. This reaction
is catalyzed by the DXS enzyme, encoded by the dxs gene. Next, the
isomerization and reduction of D-1-deoxyxylulose-5-phosphate yields
2-C-methyl-D-erythritol-4-phosphate. One of the enzymes involved in the
isomerization and reduction process is D-1-deoxyxylulose-5-phosphate
reductoisomerase (DXR), encoded by the gene dxr (ispC).
2-C-methyl-D-erythritol-4-phosphate is subsequently converted into
4-diphosphocytidyl-2C-methyl-D-erythritol in a CTP-dependent reaction
by the enzyme encoded by the non-annotated gene ygbP. Recently,
however, the ygbP gene was renamed as ispD as a part of the isp gene
cluster (SwissProtein Accession #Q46893).

Next, the 2nd position hydroxy group of 4-diphosphocytidyl-2C-
methyl-D-erythritol can be phosphorylated in an ATP-dependent reaction
by the enzyme encoded by the ychB gene. YchB phosphorylates
4-diphosphocytidyl-2C-methyl-D-erythritol, resulting in
4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate. The ychB gene
was renamed as ispE, also as a part of the isp gene cluster (SwissProtein
Accession #P24209). YgbB converts
4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate to
2C-methyl-D-erythritol 2,4-cyclodiphosphate in a CTP-dependent
manner. This gene has also been recently renamed, and
belongs to the isp gene cluster. Specifically, the new name for the ygbB
gene is ispF (SwissProtein Accession #P36663).

The enzymes encoded by the gcpE (ispG) and lytB (ispH) genes
(and perhaps others) are thought to participate in the reactions leading to
formation of isopentenyl pyrophosphate (IPP) and dimethylallyl
pyrophosphate (DMAPP). IPP may be isomerized to DMAPP via IPP
isomerase, encoded by the idi gene. However, this enzyme is not
essential for survival and may be absent in some bacteria using the
2-C-methyl-D-erythritol 4-phosphate (MEP) pathway. Recent evidence
suggests that the MEP pathway branches before IPP and separately
produces IPP and DMAPP via the lytB gene product. A lytB knockout
mutation is lethal in E. coli except in media supplemented with both IPP
and DMAPP.

The synthesis of FPP occurs via the isomerization of IPP to
dimethylallyl pyrophosphate. This reaction is followed by a sequence of
two prenyltransferase reactions catalyzed by ispA, leading to the creation
of geranyl pyrophosphate (GPP; a 10-carbon molecule) and farnesyl
pyrophosphate (FPP; a 15-carbon molecule).
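
The sequence of steps described above can be summarized as an ordered list. The following is only an illustrative sketch: the table `UPPER_PATHWAY` and the function `genes_between` are hypothetical names, the gcpE/lytB step is collapsed into a single entry even though the text describes their roles only tentatively, and the idi isomerization is omitted for simplicity.

```python
# Compound names as used in the description.
START = "pyruvate + D-glyceraldehyde 3-phosphate"
DXP = "D-1-deoxyxylulose-5-phosphate"
MEP = "2-C-methyl-D-erythritol-4-phosphate"
CDP_ME = "4-diphosphocytidyl-2C-methyl-D-erythritol"
CDP_ME2P = "4-diphosphocytidyl-2C-methyl-D-erythritol 2-phosphate"
MECPP = "2C-methyl-D-erythritol 2,4-cyclodiphosphate"
C5 = "IPP / DMAPP"
GPP = "GPP"
FPP = "FPP"

# (gene, substrate, product) in pathway order.
UPPER_PATHWAY = [
    ("dxs", START, DXP),
    ("dxr (ispC)", DXP, MEP),
    ("ygbP (ispD)", MEP, CDP_ME),
    ("ychB (ispE)", CDP_ME, CDP_ME2P),
    ("ygbB (ispF)", CDP_ME2P, MECPP),
    ("gcpE (ispG), lytB (ispH)", MECPP, C5),  # simplified: one combined step
    ("ispA", C5, GPP),
    ("ispA", GPP, FPP),
]


def genes_between(start, end, pathway=UPPER_PATHWAY):
    """List the genes whose products carry `start` through to `end`."""
    genes, current = [], start
    for gene, substrate, product in pathway:
        if substrate == current:
            genes.append(gene)
            current = product
            if current == end:
                return genes
    raise ValueError(f"no route from {start!r} to {end!r}")
```

For a host that already makes a given intermediate, `genes_between(intermediate, FPP)` lists the remaining steps that would have to be supplied.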

Genes encoding elements of the upper pathway are known from a
variety of plant, animal, and bacterial sources, as shown in Table 1.

Table 1

Sources of Genes Encoding the Upper Isoprene Pathway

Gene
    GenBank Accession Number and Source Organism


dxs (D-1-deoxyxylulose 5-phosphate synthase)
    AF035440, Escherichia coli
    Y18874, Synechococcus PCC6301
    AB026631, Streptomyces sp. CL190
    AB042821, Streptomyces griseolosporeus
    AF111814, Plasmodium falciparum
    AF143812, Lycopersicon esculentum
    AJ279019, Narcissus pseudonarcissus
    AJ291721, Nicotiana tabacum


dxr (ispC) (1-deoxy-D-xylulose 5-phosphate reductoisomerase)
    AB013300, Escherichia coli
    AB049187, Streptomyces griseolosporeus
    AF111813, Plasmodium falciparum
    AF116825, Mentha x piperita
    AF148852, Arabidopsis thaliana
    AF182287, Artemisia annua
    AF250235, Catharanthus roseus
    AF282879, Pseudomonas aeruginosa
    AJ242588, Arabidopsis thaliana
    AJ250714, Zymomonas mobilis strain ZM4
    AJ292312, Klebsiella pneumoniae
    AJ297566, Zea mays


ygbP (ispD) (2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase)
    AB037876, Arabidopsis thaliana
    AF109075, Clostridium difficile
    AF230736, Escherichia coli
    AF230737, Arabidopsis thaliana


ychB (ispE) (4-diphosphocytidyl-2-C-methyl-D-erythritol kinase)
    AF216300, Escherichia coli
    AF263101, Lycopersicon esculentum
    AF288615, Arabidopsis thaliana


ygbB (ispF) (2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase)
    AB038256, Escherichia coli mecs gene
    AF230738, Escherichia coli
    AF250236, Catharanthus roseus (MECS)
    AF279661, Plasmodium falciparum
    AF321531, Arabidopsis thaliana


gcpE (ispG) (1-hydroxy-2-methyl-2-(E)-butenyl 4-diphosphate synthase)
    O67496, Aquifex aeolicus
    P54482, Bacillus subtilis
    Q9PKY3, Chlamydia muridarum
    Q9Z8H0, Chlamydophila pneumoniae
    O84060, Chlamydia trachomatis
    P27433, Escherichia coli
    P44667, Haemophilus influenzae
    Q9ZLL0, Helicobacter pylori J99
    O33350, Mycobacterium tuberculosis
    S77159, Synechocystis sp.
    Q9WZZ3, Thermotoga maritima
    O83460, Treponema pallidum
    Q9JZ40, Neisseria meningitidis
    Q9PPM1, Campylobacter jejuni
    Q9RXC9, Deinococcus radiodurans
    AAG07190, Pseudomonas aeruginosa
    Q9KTX1, Vibrio cholerae


lytB (ispH)
    AF027189, Acinetobacter sp. BD413
    AF098521, Burkholderia pseudomallei
    AF291696, Streptococcus pneumoniae
    AF323927, Plasmodium falciparum gene
    M87645, Bacillus subtilis
    U38915, Synechocystis sp.
    X89371, C. jejuni sp O67496


ispA (FPP synthase)
    AB003187, Micrococcus luteus
    AB016094, Synechococcus elongatus
    AB021747, Oryza sativa FPPS1 gene for farnesyl diphosphate synthase
    AB028044, Rhodobacter sphaeroides
    AB028046, Rhodobacter capsulatus
    AB028047, Rhodovulum sulfidophilum
    AF112881 and AF136602, Artemisia annua
    AF384040, Mentha x piperita
    D00694, Escherichia coli
    D13293, B. stearothermophilus
    D85317, Oryza sativa
    X75789, A. thaliana
    Y12072, G. arboreum
    Z49786, H. brasiliensis
    U80605, Arabidopsis thaliana farnesyl diphosphate synthase
        precursor (FPS1) mRNA, complete cds
    X76026, K. lactis FPS gene for farnesyl diphosphate synthetase,
        QCR8 gene for bc1 complex, subunit VIII
    X82542, P. argentatum mRNA for farnesyl diphosphate synthase (FPS1)
    X82543, P. argentatum mRNA for farnesyl diphosphate synthase (FPS2)
    BC010004, Homo sapiens, farnesyl diphosphate synthase (farnesyl
        pyrophosphate synthetase, dimethylallyltranstransferase,
        geranyltranstransferase), clone MGC 15352 IMAGE, 4132071, mRNA,
        complete cds
    AF234168, Dictyostelium discoideum farnesyl diphosphate synthase
        (Dfps)
    L46349, Arabidopsis thaliana farnesyl diphosphate synthase (FPS2)
        mRNA, complete cds
    L46350, Arabidopsis thaliana farnesyl diphosphate synthase (FPS2)
        gene, complete cds
    L46367, Arabidopsis thaliana farnesyl diphosphate synthase (FPS1)
        gene, alternative products, complete cds
    M89945, Rat farnesyl diphosphate synthase gene, exons 1-8
    NM_002004, Homo sapiens farnesyl diphosphate synthase (farnesyl
        pyrophosphate synthetase, dimethylallyltranstransferase,
        geranyltranstransferase) (FDPS), mRNA
    U36376, Artemisia annua farnesyl diphosphate synthase (fps1) mRNA,
        complete cds
    XM_001352, Homo sapiens farnesyl diphosphate synthase (farnesyl
        pyrophosphate synthetase, dimethylallyltranstransferase,
        geranyltranstransferase) (FDPS), mRNA
    XM_034497, Homo sapiens farnesyl diphosphate synthase (farnesyl
        pyrophosphate synthetase, dimethylallyltranstransferase,
        geranyltranstransferase) (FDPS), mRNA
    XM_034498, Homo sapiens farnesyl diphosphate synthase (farnesyl
        pyrophosphate synthetase, dimethylallyltranstransferase,
        geranyltranstransferase) (FDPS), mRNA
    XM_034499, Homo sapiens farnesyl diphosphate synthase (farnesyl
        pyrophosphate synthetase, dimethylallyltranstransferase,
        geranyltranstransferase) (FDPS), mRNA
    XM_0345002, Homo sapiens farnesyl diphosphate synthase (farnesyl
        pyrophosphate synthetase, dimethylallyltranstransferase,
        geranyltranstransferase) (FDPS), mRNA



The most preferred source of genes for the upper isoprene
pathway in the present invention is from Methylomonas 16a (ATCC
PTA-2402). Methylomonas 16a is particularly well-suited for the present
invention, as the methanotroph is naturally pink-pigmented, producing a
30-carbon carotenoid. Thus, the organism possesses the genes of the
upper isoprene pathway. Sequences of these preferred genes are
presented as the following SEQ ID numbers: the dxs(16a) gene (SEQ ID
NO:13), the dxr(16a) gene (SEQ ID NO:17), and the lytB(16a) gene (SEQ
ID NO:15).

The Lower Carotenoid Biosynthetic Pathway

The division between the upper isoprenoid pathway and the lower
carotenoid pathway is somewhat subjective. Because FPP synthesis is
common in both carotenogenic and non-carotenogenic bacteria, the first
step in the lower carotenoid biosynthetic pathway is considered to begin
with the prenyltransferase reaction converting farnesyl pyrophosphate
(FPP) to geranylgeranyl pyrophosphate (GGPP). The gene crtE,
encoding GGPP synthetase, is responsible for this prenyltransferase
reaction which adds IPP to FPP to produce the 20-carbon molecule
GGPP. A condensation reaction of two molecules of GGPP occurs to
form phytoene (PPPP), the first 40-carbon molecule of the lower
carotenoid biosynthesis pathway. This enzymatic reaction is catalyzed by
phytoene synthase, encoded by crtB.
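
The carbon counts quoted for these intermediates follow directly from the C5 isoprene unit. The arithmetic can be checked with a small sketch; the variable names are illustrative only.

```python
C5_UNIT = 5  # carbons contributed by one isoprene unit (IPP or DMAPP)

# Each prenyltransferase step adds one C5 unit to the growing chain.
gpp_carbons = 2 * C5_UNIT              # DMAPP + IPP -> GPP
fpp_carbons = gpp_carbons + C5_UNIT    # GPP + IPP -> FPP
ggpp_carbons = fpp_carbons + C5_UNIT   # FPP + IPP -> GGPP (crtE)

# Phytoene synthase (crtB) condenses two GGPP molecules.
phytoene_carbons = 2 * ggpp_carbons

# Eight C5 units in total make up the C40 backbone.
isoprene_units_in_phytoene = phytoene_carbons // C5_UNIT
```

This reproduces the 10-, 15-, 20- and 40-carbon figures given in the text for GPP, FPP, GGPP and phytoene.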

Lycopene, which imparts a "red" colored spectra, is produced from
phytoene through four sequential dehydrogenation reactions by the
removal of eight atoms of hydrogen, catalyzed by phytoene desaturase
(encoded by the gene crtI). Intermediates in this reaction are phytofluene,
zeta-carotene, and neurosporene.
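
The hydrogen bookkeeping for the four desaturation steps can be worked through as follows. This is a sketch for illustration: it assumes the standard molecular formula of phytoene, C40H64, which is not stated in the text, and the function name `desaturation_formulas` is hypothetical.

```python
# Intermediates in the order given in the text.
DESATURATION_SERIES = [
    "phytoene", "phytofluene", "zeta-carotene", "neurosporene", "lycopene",
]

PHYTOENE_FORMULA = (40, 64)  # (carbons, hydrogens); assumed, C40H64


def desaturation_formulas(series=DESATURATION_SERIES, start=PHYTOENE_FORMULA):
    """Each dehydrogenation removes two hydrogen atoms and no carbon."""
    carbons, hydrogens = start
    formulas = {}
    for step, name in enumerate(series):
        formulas[name] = (carbons, hydrogens - 2 * step)
    return formulas
```

Four steps at two hydrogens each account for the eight hydrogen atoms removed between phytoene and lycopene.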

Lycopene cyclase (crtY) converts lycopene to β-carotene. In the
present invention, a reporter plasmid is used which produces β-carotene
as the genetic end product. However, additional genes may be used to
create a variety of other carotenoids. For example, β-carotene is
converted to zeaxanthin via a hydroxylation reaction resulting from the
activity of β-carotene hydroxylase (encoded by the crtZ gene).
β-cryptoxanthin is an intermediate in this reaction.

β-carotene is converted to canthaxanthin by β-carotene ketolase
encoded by either the crtW or crtO gene. Echinenone is an intermediate in
this reaction. Canthaxanthin can then be converted to astaxanthin by
β-carotene hydroxylase encoded by the crtZ or crtR gene. Adonirubin is
an intermediate in this reaction.

Zeaxanthin can be converted to zeaxanthin-β-diglucoside. This
reaction is catalyzed by zeaxanthin glucosyl transferase (crtX).

Zeaxanthin can be converted to astaxanthin by β-carotene
ketolase encoded by crtW, crtO or bkt. The BKT/CrtW enzymes
synthesize canthaxanthin via echinenone from β-carotene and
4-ketozeaxanthin. Adonixanthin is an intermediate in this reaction.

Spheroidene can be converted to spheroidenone by spheroidene
monooxygenase encoded by crtA.

Neurosporene can be converted to spheroidene and lycopene can be
converted to spirilloxanthin by the sequential actions of
hydroxyneurosporene synthase, methoxyneurosporene desaturase and
hydroxyneurosporene-O-methyltransferase encoded by the crtC, crtD and
crtF genes, respectively.

β-carotene can be converted to isorenieratene by β-carotene
desaturase encoded by crtU.

Genes encoding elements of the lower carotenoid biosynthetic
pathway are known from a variety of plant, animal, and bacterial sources,
as shown in Table 2.

Table 2

Sources of Genes Encoding the Lower Carotenoid Biosynthetic Pathway

Gene
    GenBank Accession Number and Source Organism

crtE (GGPP Synthase)
    AB000835, Arabidopsis thaliana
    AB016043 and AB019036, Homo sapiens
    AB016044, Mus musculus
    AB027705 and AB027706, Daucus carota
    AB034249, Croton sublyratus
    AB034250, Scoparia dulcis
    AF020041, Helianthus annuus
    AF049658, Drosophila melanogaster signal recognition particle
        19kDa protein (srp19) gene, partial sequence; and geranylgeranyl
        pyrophosphate synthase (quemao) gene, complete cds
    AF049659, Drosophila melanogaster geranylgeranyl pyrophosphate
        synthase mRNA, complete cds
    AF139916, Brevibacterium linens
    AF279807, Penicillium paxilli geranylgeranyl pyrophosphate
        synthase (ggs1) gene, complete
    AF279808, Penicillium paxilli dimethylallyl tryptophan synthase
        (paxD) gene, partial cds; and cytochrome P450 monooxygenase
        (paxQ), cytochrome P450 monooxygenase (paxP), PaxC (paxC),
        monooxygenase (paxM), geranylgeranyl pyrophosphate synthase
        (paxG), PaxU (paxU), and metabolite transporter (paxT) genes,
        complete cds
    AJ010302, Rhodobacter sphaeroides
    AJ133724, Mycobacterium aurum
    AJ276129, Mucor circinelloides f. lusitanicus carG gene for
        geranylgeranyl pyrophosphate synthase, exons 1-6
    D85029, Arabidopsis thaliana mRNA for geranylgeranyl
        pyrophosphate synthase, partial cds
    L25813, Arabidopsis thaliana
    L37405, Streptomyces griseus geranylgeranyl pyrophosphate
        synthase (crtB), phytoene desaturase (crtE) and phytoene
        synthase (crtI) genes, complete cds
    U15778, Lupinus albus geranylgeranyl pyrophosphate synthase
        (ggps1) mRNA, complete cds
    U44876, Arabidopsis thaliana pregeranylgeranyl pyrophosphate
        synthase (GGPS2) mRNA, complete cds
    X92893, C. roseus
    X95596, S. griseus
    X98795, S. alba
    Y15112, Paracoccus marcusii


crtX (Zeaxanthin glucosylase)
    D90087, E. uredovora
    M87280 and M90698, Pantoea agglomerans


crtY (Lycopene-β-cyclase)
    AF139916, Brevibacterium linens
    AF152246, Citrus x paradisi
    AF218415, Bradyrhizobium sp. ORS278
    AF272737, Streptomyces griseus strain IFO13350
    AJ133724, Mycobacterium aurum
    AJ250827, Rhizomucor circinelloides f. lusitanicus carRP gene for
        lycopene cyclase/phytoene synthase, exons 1-2
    AJ276965, Phycomyces blakesleeanus carRA gene for phytoene
        synthase/lycopene cyclase, exons 1-2
    D58420, Agrobacterium aurantiacum
    D83513, Erythrobacter longus
    L40176, Arabidopsis thaliana lycopene cyclase (LYC) mRNA,
        complete cds
    M87280, Pantoea agglomerans
    U50738, Arabidopsis thaliana lycopene epsilon cyclase mRNA,
        complete cds
    U50739, Arabidopsis thaliana lycopene β cyclase mRNA, complete cds
    U62808, Flavobacterium ATCC21588
    X74599, Synechococcus sp. lcy gene for lycopene cyclase
    X81787, N. tabacum CrtL-1 gene encoding lycopene cyclase
    X86221, C. annuum
    X86452, L. esculentum mRNA for lycopene β-cyclase
    X95596, S. griseus
    X98796, N. pseudonarcissus


crtI (Phytoene desaturase)
    AB046992, Citrus unshiu CitPDS1 mRNA for phytoene desaturase,
        complete cds
    AF039585, Zea mays phytoene desaturase (pds1) gene promoter
        region and exon 1
    AF049356, Oryza sativa phytoene desaturase precursor (Pds) mRNA,
        complete cds
    AF139916, Brevibacterium linens
    AF218415, Bradyrhizobium sp. ORS278
    AF251014, Tagetes erecta
    AF364515, Citrus x paradisi
    D58420, Agrobacterium aurantiacum
    D83514, Erythrobacter longus
    L16237, Arabidopsis thaliana
    L37405, Streptomyces griseus geranylgeranyl pyrophosphate
        synthase (crtB), phytoene desaturase (crtE) and phytoene
        synthase (crtI) genes, complete cds
    L39266, Zea mays phytoene desaturase (Pds) mRNA, complete cds
    M64704, Soybean phytoene desaturase
    M88683, Lycopersicon esculentum phytoene desaturase (pds) mRNA,
        complete cds
    S71770, carotenoid gene cluster
    U37285, Zea mays
    U46919, Solanum lycopersicum phytoene desaturase (Pds) gene,
        partial cds
    U62808, Flavobacterium ATCC21588
    X55289, Synechococcus pds gene for phytoene desaturase
    X59948, L. esculentum
    X62574, Synechocystis sp. pds gene for phytoene desaturase
    X68058, C. annuum pds1 mRNA for phytoene desaturase
    X71023, Lycopersicon esculentum pds gene for phytoene desaturase
    X78271, L. esculentum (Ailsa Craig) PDS gene
    X78434, P. blakesleeanus (NRRL1555) carB gene
    X78815, N. pseudonarcissus
    X86783, H. pluvialis
    Y14807, Dunaliella bardawil
    Y15007, Xanthophyllomyces dendrorhous
    Y15112, Paracoccus marcusii
    Y15114, Anabaena PCC7210 crtP gene
    Z11165, R. capsulatus

crtB (Phytoene synthase)
    AB001284, Spirulina platensis
    AB032797, Daucus carota PSY mRNA for phytoene synthase,
        complete cds
    AB034704, Rubrivivax gelatinosus
    AB037975, Citrus unshiu
    AF009954, Arabidopsis thaliana phytoene synthase (PSY) gene,
        complete cds
    AF139916, Brevibacterium linens
    AF152892, Citrus x paradisi
    AF218415, Bradyrhizobium sp. ORS278
    AF220218, Citrus unshiu phytoene synthase (Psy1) mRNA,
        complete cds
    AJ010302, Rhodobacter
    AJ133724, Mycobacterium aurum
    AJ278287, Phycomyces blakesleeanus carRA gene for lycopene
        cyclase/phytoene synthase,
    AJ304825, Helianthus annuus mRNA for phytoene synthase (psy gene)
    AJ308385, Helianthus annuus mRNA for phytoene synthase (psy gene)
    D58420, Agrobacterium aurantiacum
    L23424, Lycopersicon esculentum phytoene synthase (PSY2) mRNA,
        complete cds
    L25812, Arabidopsis thaliana
    L37405, Streptomyces griseus geranylgeranyl pyrophosphate
        synthase (crtB), phytoene desaturase (crtE) and phytoene
        synthase (crtI) genes, complete cds
    M38424, Pantoea agglomerans phytoene synthase (crtE) gene,
        complete cds
    M87280, Pantoea agglomerans
    S71770, Carotenoid gene cluster
    U32636, Zea mays phytoene synthase (Y1) gene, complete cds
    U62808, Flavobacterium ATCC21588
    U87626, Rubrivivax gelatinosus
    U91900, Dunaliella bardawil
    X52291, Rhodobacter capsulatus
    X60441, L. esculentum GTom5 gene for phytoene synthase
    X63873, Synechococcus PCC7942 pys gene for phytoene synthase
    X68017, C. annuum psy1 mRNA for phytoene synthase
    X69172, Synechocystis sp. pys gene for phytoene synthase
    X78814, N. pseudonarcissus

crtZ (β-carotene hydroxylase)
    D58420, Agrobacterium aurantiacum
    D58422, Alcaligenes sp.
    D90087, E. uredovora
    M87280, Pantoea agglomerans
    Y15112, Paracoccus marcusii

crtW (β-carotene ketolase)
    AF218415, Bradyrhizobium sp. ORS278
    D45881, Haematococcus pluvialis
    D58420, Agrobacterium aurantiacum
    D58422, Alcaligenes sp.
    X86782, H. pluvialis
    Y15112, Paracoccus marcusii

crtO (β-C4-ketolase)
    X86782, H. pluvialis
    Y15112, Paracoccus marcusii

crtU (β-carotene dehydrogenase)
    AF047490, Zea mays
    AF121947, Arabidopsis thaliana
    AF139916, Brevibacterium linens
    AF195507, Lycopersicon esculentum
    AF272737, Streptomyces griseus strain IFO13350
    AF372617, Citrus x paradisi
    AJ133724, Mycobacterium aurum
    AJ224683, Narcissus pseudonarcissus
    D26095 and U38550, Anabaena sp.
    X89897, C. annuum
    Y15115, Anabaena PCC7210 crtQ gene

crtA (spheroidene monooxygenase)
    AJ010302, Rhodobacter sphaeroides
    Z11165 and X52291, Rhodobacter capsulatus

crtC (hydroxyneurosporene synthase)
    AB034704, Rubrivivax gelatinosus
    AF195122 and AJ010302, Rhodobacter sphaeroides
    AF287480, Chlorobium tepidum
    U73944, Rubrivivax gelatinosus
    X52291 and Z11165, Rhodobacter capsulatus
    Z21955, M. xanthus

crtD (carotenoid 3,4-desaturase)
    AJ010302 and X63204, Rhodobacter sphaeroides
    U73944, Rubrivivax gelatinosus
    X52291 and Z11165, Rhodobacter capsulatus

crtF (1-OH-carotenoid methylase)
    AB034704, Rubrivivax gelatinosus
    AF288602, Chloroflexus aurantiacus
    AJ010302, Rhodobacter sphaeroides
    X52291 and Z11165, Rhodobacter capsulatus


The most preferred source of crt genes is Pantoea stewartii.
Sequences of these preferred genes are presented as the following SEQ
ID numbers: the crtE gene (SEQ ID NO:1), the crtX gene (SEQ ID NO:3),
the crtY gene (SEQ ID NO:5), the crtI gene (SEQ ID NO:7), the crtB gene
(SEQ ID NO:9), and the crtZ gene (SEQ ID NO:11).

By using various combinations of the genes presented in Table 2
and the preferred genes of the present invention, innumerable different
carotenoids and carotenoid derivatives could be made using the methods
of the present invention, provided that sufficient sources of FPP are
available in the host organism. For example, the gene cluster crtEXYIB
enables the production of β-carotene. Addition of the crtZ gene to
crtEXYIB enables the production of zeaxanthin.

It is envisioned that useful products of the present invention will
include any carotenoid compound as defined herein including, but not
limited to, antheraxanthin, adonixanthin, astaxanthin, canthaxanthin,
capsorubrin, β-cryptoxanthin, didehydrolycopene, β-carotene,
ζ-carotene, δ-carotene, γ-carotene, keto-γ-carotene, ψ-carotene,
ε-carotene, β,ψ-carotene, torulene, echinenone, gamma-carotene,
zeta-carotene, alpha-cryptoxanthin, diatoxanthin,
7,8-didehydroastaxanthin, fucoxanthin, fucoxanthinol, isorenieratene,
β-isorenieratene, lactucaxanthin, lutein, lycopene, neoxanthin,
neurosporene, hydroxyneurosporene, peridinin, phytoene, rhodopin,
rhodopin glucoside, siphonaxanthin, spheroidene, spheroidenone,
spirilloxanthin, uriolide, uriolide acetate, violaxanthin,
zeaxanthin-β-diglucoside, zeaxanthin, and C30-carotenoids.
Methods for Optimizing the Carotenoid Biosynthetic Pathway

Metabolic engineering generally involves the introduction of new
metabolic activities into the host organism or the improvement of existing
processes by engineering changes such as adding, removing, or
modifying genetic elements (Stephanopoulos, G., Metab. Eng., 1:1-11
(1999)). One such modification is genetically engineering modulations to
the expression of relevant genes in a metabolic pathway.

There are a variety of ways to modulate gene expression.
Microbial metabolic engineering generally involves the use of multi-copy
vectors to express a gene of interest under the control of a constitutive or
inducible promoter. This method of metabolic engineering for industrial
use has several drawbacks. It is sometimes difficult to maintain the
vectors due to segregational instability. Deleterious effects on cell viability
and growth are often observed due to the vector burden. It is also difficult
to control the optimal expression level of desired genes on a vector. To
avoid the undesirable effects of using a multi-copy vector, a chromosomal
integration approach using homologous recombination via a single
insertion of bacteriophage λ, transposons, or other suitable vectors
containing the gene of interest has been used. However, this method also
has drawbacks, such as the need for multiple cloning steps in order to get
the gene of interest into a suitable vector prior to recombination. Another
drawback is the instability associated with the inserted genes, which can
be lost due to excision. Lastly, these methods are limited in the number
of possible insertions and do not allow control over the location of the
insertion site on the chromosome.

Several processes are involved in the regulation of gene
expression. The main steps are (1) the initiation of transcription, (2) the
termination of transcription, (3) the processing of transcripts, and (4)
translation. Among these, transcription initiation is a major step for
controlling gene expression. Transcription initiation is determined by
the sequence of the promoter region, which includes a binding site for RNA
polymerase together with possible binding sites for one or more
transcription factors.

Strong promoters are widely used for constitutive overexpression of
key genes in a metabolic pathway. Strong and moderately strong
promoters that are useful for expression in E. coli include lac, trp, λPL,
λPR, T7, tac, T5 (PT5), and trc. A conventional way to regulate the amount
and the timing of protein expression is to use an inducible promoter. An
inducible promoter is not always active the way constitutive promoters are
(e.g. viral promoters). Inducible promoters are normally activated in
response to certain environmental or chemical stimuli (e.g. heat shock
promoters, isopropyl-β-thiogalactopyranoside (IPTG) responsive promoters,
and tetracycline (tet) responsive promoters, to name a few).

Promoters of the stationary phase σS regulon, which are active
under stress conditions and at the onset of the stationary phase, control
expression of about 100 genes involved in the protection of the cell
against various stresses. The promoters of the σS regulon genes may
also be useful for the expression of the desired genes when the
metabolite products inhibit cell growth. The σS-dependent stationary
phase promoters include rpoS, bolA, appV, dps, cyxAB-appA, csgA,
treA, osmB, katE, xthA, otsBA, glgS, osmY, pex, and mcc, to name a few.

Termination control regions may also be derived from various
genes native to the preferred hosts. A termination site may be
unnecessary; however, it is most preferred if included.

Alternatively, it may be necessary to reduce or eliminate the
expression of certain genes in the target pathway or in competing
pathways that may serve as competing sinks for energy or carbon.
Methods of down-regulating genes for this purpose have been explored.
Where the sequence of the gene to be disrupted is known, one of the
most effective methods of gene down-regulation is targeted gene
disruption, a process where foreign DNA is inserted into a structural gene
so as to disrupt transcription. This can be effected by the creation of
genetic cassettes comprising the DNA to be inserted (often a genetic
marker) flanked by sequence having a high degree of homology to a
portion of the gene to be disrupted. Introduction of the cassette into the
host cell results in insertion of the foreign DNA into the structural gene via
the native DNA replication mechanisms of the cell or by the λ-Red
recombination system used in the present invention. (See for example
Hamilton et al., J. Bacteriol., 171:4617-4622 (1989); Balbas et al., Gene,
136:211-213 (1993); Gueldener et al., Nucleic Acids Res., 24:2519-2524
(1996); and Smith et al., Methods Mol. Cell. Biol., 5:270-277 (1996)).

Antisense technology is another method of down-regulating genes
where the sequence of the target gene is known. To accomplish this, a
nucleic acid segment from the desired gene is cloned and operably linked
to a promoter such that the antisense strand of RNA will be transcribed.
This construct is then introduced into the host cell and the antisense
strand of RNA is produced. Antisense RNA inhibits gene expression by
preventing the accumulation of mRNA which encodes the protein of
interest. A person of skill in the art will know that special considerations
are associated with the use of antisense technologies in order to reduce
expression of particular genes. For example, the proper level of
expression of antisense genes may require the use of different chimeric
genes utilizing different regulatory elements known to the skilled artisan.
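
The orientation step in the antisense approach can be illustrated with a short sketch. It only shows how the cloned segment relates to the sense sequence: placing the reverse complement of the sense fragment downstream of a promoter yields a transcript complementary to the target mRNA. The function names and the example sequence are hypothetical and are not taken from the present disclosure.

```python
# Illustrative sketch only: helper names and the sequence are hypothetical.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_insert(sense_fragment: str) -> str:
    """Return the fragment in the orientation that, read 5'->3' downstream
    of a promoter, gives a transcript complementary to the sense mRNA."""
    return reverse_complement(sense_fragment)


if __name__ == "__main__":
    sense = "ATGGCC"  # toy fragment, not a real gene segment
    print(antisense_insert(sense))
```

Applying the function twice returns the original fragment, which is a convenient sanity check on the orientation of a designed construct.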

Although targeted gene disruption and antisense technology offer
effective means of down-regulating genes where the sequence is known,
other less specific methodologies have been developed that are not
sequence based. For example, cells may be exposed to UV radiation and
then screened for the desired phenotype. Mutagenesis with chemical
agents is also effective for generating mutants and commonly used
substances include chemicals that affect non-replicating DNA such as
HNO2 and NH2OH, as well as agents that affect replicating DNA such as
acridine dyes, notable for causing frame-shift mutations. Specific
methods for creating mutants using radiation or chemical agents are well
documented in the art. See for example Thomas D. Brock in
Biotechnology: A Textbook of Industrial Microbiology, Second Edition
(1989) Sinauer Associates, Inc., Sunderland, MA., or Deshpande, Mukund
V., Appl. Biochem. Biotechnol., 36, 227, (1992).

Another non-specific method of gene disruption is the use of
transposable elements or transposons. Transposons are genetic
elements that insert randomly into DNA but can be later retrieved on the
basis of sequence to determine where the insertion has occurred. Both
in vivo and in vitro transposition methods are known. Both methods
involve the use of a transposable element in combination with a
transposase enzyme. When the transposable element or transposon is
contacted with a nucleic acid fragment in the presence of the transposase,
the transposable element will randomly insert into the nucleic acid
fragment. The technique is useful for random mutagenesis and for gene
isolation, since the disrupted gene may be identified on the basis of the
sequence of the transposable element. Kits for in vitro transposition are
commercially available (see for example The Primer Island Transposition
Kit, available from Perkin Elmer Applied Biosystems, Branchburg, NJ,
based upon the yeast Ty1 element; The Genome Priming System,
available from New England Biolabs, Beverly, MA, based upon the
bacterial transposon Tn7; and the EZ::TN Transposon Insertion Systems,
available from Epicentre Technologies, Madison, WI, based upon the Tn5
bacterial transposable element). Transposon-mediated random insertion
in the chromosome can be used for isolating mutants for any number of
applications including enhanced production of any number of desired
products including enzymes or other proteins, amino acids, or small
organic molecules including alcohols.

The present invention has made use of this last method of pathway
modulation to cause mutations in various essential genes to test whether
there was any effect on the output of the carotenoid biosynthetic pathway.
Transposon mutagenesis was used to create an E. coli mutant having a
partial disruption in the yjeR gene. The precise sequence of the mutated
gene is given as SEQ ID NO:63. This yjeR mutation (yjeR::Tn5) resulted in
increased β-carotene production through an increase in plasmid copy
number of the carotenoid producing plasmid (pPCB15 or pDCQ108). The
effect of mutation of this locus on plasmids is novel and could not have
been predicted from known studies. Stacking the yjeR mutation
(yjeR::Tn5) into the engineered E. coli strains that were made by
chromosomal engineering of a non-endogenous promoter upstream of
isoprenoid genes and chromosomally integrating non-endogenous
isoprenoid pathway genes allowed further increases of β-carotene
production.

The general methods described herein for pathway modulation are
useful and enable the skilled person to practice the present invention. It
will be appreciated that other, less traditional methods may be envisioned
that will allow the practitioner to make the necessary modifications in the
isoprenoid pathway. One such method involving chromosomal promoter
replacement using a bacteriophage transduction system was used herein
to good effect and is described below.

Optimization of Carotenoid Production in E. coli by Bacteriophage
Transduction

The present method combines promoter replacement via
homologous recombination (in a recombination proficient host) with a
bacteriophage transducing system. The method allows for the rapid
insertion of strong promoters upstream of desired elements for increased
gene expression. The method also facilitates the production of libraries to
assess which combinations of expressible genetic elements will optimize
production of the desired genetic end product (Figure 12). In this way,
genes not normally associated with a particular biosynthetic pathway may
be identified which unexpectedly have significant effects on the production
of the desired genetic end product.

Integration Cassettes

One aspect of the promoter replacement method is the use of an
integration cassette. As used in the present invention, "integration
cassettes" are the linear double-stranded DNA fragments chromosomally
integrated by homologous recombination via the use of two PCR-generated
fragments or one PCR-generated fragment as seen in Figure 2.
The integration cassette comprises a nucleic acid integration fragment
that contains an expressible DNA fragment and a selectable marker
bounded by specific recombinase sites responsive to a site-specific
recombinase, and homology arms having homology to different portions of
the host cell's chromosome. Typically, the integration cassette will have
the general structure: 5'-RR1-RS-SM-RS-Y-RR2-3' wherein
(i) RR1 is a first homology arm;
(ii) RS is a recombination site responsive to a site-specific
recombinase;
(iii) SM is a DNA fragment encoding a selectable marker;
(iv) Y is a first expressible DNA fragment; and
(v) RR2 is a second homology arm.
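
As an illustration of the general structure above, the following sketch models the cassette as an ordered set of parts and shows the layout before and after the selectable marker is excised. It assumes the two RS sites are identical direct repeats, so that site-specific recombination between them leaves a single RS copy behind. The class name, field names, and placeholder strings are hypothetical and stand in for real sequences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntegrationCassette:
    """Toy model of the 5'-RR1-RS-SM-RS-Y-RR2-3' layout (names hypothetical)."""
    rr1: str  # first homology arm
    rs: str   # recombination site recognized by the site-specific recombinase
    sm: str   # selectable marker
    y: str    # expressible DNA fragment (e.g. a promoter)
    rr2: str  # second homology arm

    def sequence(self) -> str:
        """Concatenate the parts in cassette order, 5' to 3'."""
        return self.rr1 + self.rs + self.sm + self.rs + self.y + self.rr2

    def after_marker_excision(self) -> str:
        """Layout once the recombinase has removed the marker.

        Assumes direct-repeat RS sites: recombination between them removes
        the marker and one RS copy, leaving a single RS site.
        """
        return self.rr1 + self.rs + self.y + self.rr2


if __name__ == "__main__":
    cassette = IntegrationCassette(
        rr1="[RR1]", rs="[RS]", sm="[SM]", y="[Y]", rr2="[RR2]"
    )
    print(cassette.sequence())               # full cassette
    print(cassette.after_marker_excision())  # marker removed
```

Keeping the parts separate in this way makes it straightforward to swap the expressible fragment Y or the homology arms when designing cassettes for different chromosomal targets.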

Expressible DNA fragments of the invention are those that will be
useful in genetically engineering biosynthetic pathways. For example, it
may be useful to engineer a strong promoter in place of a native promoter
in certain pathways. Virtually any promoter is suitable for the present
invention including, but not limited to, lac, ara, tet, trp, λPL, λPR, T7, tac,
PT5, and trc (useful for expression in Escherichia coli) as well as the amy,
apr, npr promoters and various phage promoters useful for expression in
Bacillus, for example.

Alternatively, different coding regions may be introduced
downstream of existing native promoters. In this manner, new coding
regions comprising a biosynthetic pathway may be introduced that either
complete or enhance a pathway already in existence in the host cell.
These coding regions may be genes which retain their native promoters or
may be chimeric genes operably linked to an inducible or constitutive
strong promoter for increased expression of the genes in the targeted
biosynthetic pathway. Preferred in the present invention are the genes of
the isoprenoid/carotenoid biosynthetic pathway, which include dxs, dxr,
ygbP, ychB, ygbB, idi, ispA, lytB, gcpE, ispB, gps, crtE, crtY, crtI, crtB,
crtX, and crtZ, as defined above and illustrated in Figure 1. In the present
invention, it is preferred if the expressible DNA fragment is a promoter or a
coding region useful for modulation of a biosynthetic pathway. Exemplified
in the present invention is the phage T5 strong promoter used for the
modulation of the isoprenoid biosynthetic pathway in a recombination
proficient E. coli host. In some situations the expressible DNA fragment
may be in antisense orientation where it is desired to down-regulate
certain elements of the pathway.

Generally, the preferred length of the homology arms is about 10 to
about 100 base pairs. Given the relatively short lengths of the
homology arms used in the present invention for homologous
recombination, one would expect that the level of acceptable mismatched
sequences should be kept to an absolute minimum for efficient
recombination, preferably using sequences which are identical to those
targeted for homologous recombination. As the length of homology
increases from 20 to 40 base pairs, the efficiency of homologous
recombination increases by four orders of magnitude (Yu et al., PNAS,
97:5978-5983 (2000)). Therefore, multiple mismatches within the homology
arms may decrease the efficiency of homologous recombination; however,
one skilled in the art can easily ascertain the acceptable level of
mismatching.
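
A simple design check for a candidate homology arm might look like the sketch below. It tests only the two criteria stated above: a length within about 10 to 100 base pairs and a number of mismatches against the chromosomal target at or below a chosen limit. The function names, the default of zero tolerated mismatches, and the example sequences are assumptions made for illustration; the sketch does not predict recombination efficiency.

```python
def count_mismatches(arm: str, target: str) -> int:
    """Count position-by-position differences between two equal-length sequences."""
    if len(arm) != len(target):
        raise ValueError("arm and target must have the same length")
    return sum(a != b for a, b in zip(arm.upper(), target.upper()))


def arm_is_acceptable(
    arm: str,
    target: str,
    min_len: int = 10,       # lower bound stated in the text
    max_len: int = 100,      # upper bound stated in the text
    max_mismatches: int = 0  # assumption: identical sequence preferred
) -> bool:
    """Return True if the arm meets the length and mismatch criteria."""
    if not (min_len <= len(arm) <= max_len):
        return False
    return count_mismatches(arm, target) <= max_mismatches


if __name__ == "__main__":
    target = "ACGTACGTACGTACGTACGT"       # toy 20 bp chromosomal region
    perfect = "ACGTACGTACGTACGTACGT"
    one_off = "ACGTACGTACGAACGTACGT"      # single mismatch at position 12
    print(arm_is_acceptable(perfect, target))
    print(arm_is_acceptable(one_off, target))
    print(arm_is_acceptable(one_off, target, max_mismatches=1))
```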

The present invention makes use of a selectable marker on one of
the two recombination elements (integration cassettes). Selectable
markers are known in the art and include, but are not limited to, antibiotic
resistance markers such as ampicillin, kanamycin, and tetracycline
resistance. Selectable markers may also include amino acid biosynthesis
enzymes (for selection of auxotrophs normally requiring the exogenously
supplied amino acid of interest) and enzymes which catalyze visible
changes in appearance such as β-galactosidase in lac⁻ bacteria. As used
herein, the markers are flanked by site-specific recombinase recognition
sequences. After selection and construct verification, a site-specific
recombinase is used to remove the marker. The steps of the present
invention can then be repeated with additional in vivo chromosomal
modifications. The integration cassette used to engineer the chromosomal
modification includes a promoter and/or gene, and a selection marker
flanked by site-specific recombinase sequences. Site-specific
recombinases, such as the flippase (FLP) recombinase used in the
present invention, recognize specific recombination sequences (e.g. FRT
sequences) and allow for the excision of the selectable marker. This
aspect of the invention enables the repetitive use of the present process
for multiple chromosomal modifications. The invention is not limited to the
FLP-FRT recombinase system, as several examples of site-specific
recombinases and their associated specific recognition sequences are
known in the art. Examples of other suitable site-specific recombinases
and their corresponding recognition sequences include: Cre-lox, R/RS,
Gin/gix, Xer/dif, Int/att, a pSR1 system, a cer system, and a fim system.
Recombination Proficient Host Cells 

The present invention makes use of a recombination proficient host
cell that is able to mediate efficient homologous recombination between
the integration cassettes and the host cell chromosome. Some organisms
mediate homologous recombination very effectively (yeast for example)
while others require genetic intervention. For example E. coli, a host
generally considered as one which does not undergo efficient
transformation via homologous recombination naturally, may be altered to
make it a recombination proficient host. Transformation with a helper
plasmid containing the λ-Red recombinase system increases the rate of
homologous recombination several orders of magnitude (Murphy et al.,
Gene, 246:321-330 (2000); Murphy, K., J. Bacteriol., 180:2063-2071;
Poteete and Fenton, J. Bacteriol., 182:2336-2340 (2000); Poteete, A.,
FEMS Microbiology Lett., 201:9-14 (2001); Datsenko and Wanner, supra;
Yu et al., supra; Chaveroche et al., Nucleic Acids Research, 28:e97:1-6
(2000); US 6,355,412; US 6,509,156; and US SN 60/434602). The λ-Red
system can also be chromosomally integrated into the host. The λ-Red
system contains three genes (exo, bet, and gam) which change the
normally recombination deficient E. coli into a recombination proficient
host.

Normally, E. coli efficiently degrades linear double-stranded DNA
via its RecBCD endonuclease, resulting in transformation efficiencies not
useful for chromosomal engineering. The gam gene encodes a protein
that binds to the E. coli RecBCD complex, inhibiting endonuclease activity.
The exo gene encodes a λ-exonuclease which processively degrades
the 5' end strand of double-stranded DNA and creates 3' single-stranded
overhangs. The protein encoded by bet complexes with the
λ-exonuclease, binds to the single-stranded DNA overhangs,
promotes renaturation of complementary strands, and is capable of
mediating exchange reactions. The λ-Red recombinase system enables
the use of homologous recombination as a tool for in vivo chromosomal
engineering in hosts, such as E. coli, normally considered difficult to
transform by homologous recombination. The λ-Red system works in
other bacteria as well (Poteete, A., supra, 2001). Use of the λ-Red
recombinase system should be applicable to other hosts generally used
for industrial production. These additional hosts include, but are not
limited to, Agrobacterium, Erythrobacter, Chlorobium, Chromatium,
Flavobacterium, Cytophaga, Rhodobacter, Rhodococcus, Streptomyces,
Brevibacterium, Corynebacteria, Mycobacterium, Deinococcus,
Paracoccus, Escherichia, Bacillus, Myxococcus, Salmonella, Yersinia,
Erwinia, Pantoea, Pseudomonas, Sphingomonas, Methylomonas,
Methylobacter, Methylococcus, Methylosinus, Methylomicrobium,
Methylocystis, Alcaligenes, Synechocystis, Synechococcus, Anabaena,
Thiobacillus, Methanobacterium, and Klebsiella. Preferred
hosts are selected from the group consisting of Escherichia, Bacillus, and
Methylomonas.
λ-Red Recombinase System

The λ-Red recombinase system used in the present invention is
contained on a helper plasmid (pKD46) and is comprised of three
essential genes, exo, bet, and gam (Datsenko and Wanner, supra). The
exo gene encodes a λ-exonuclease, which processively degrades the 5'
end strand of double-stranded (ds) DNA and creates 3' single-stranded
overhangs. Bet encodes a protein which complexes with the
λ-exonuclease, binds to the single-stranded DNA, promotes
renaturation of complementary strands, and is capable of mediating
exchange reactions. Gam encodes a protein that binds to the E. coli
RecBCD complex and blocks the complex's endonuclease activity.

The λ-Red system is used in the present invention because
homologous recombination in E. coli occurs at a very low frequency and
usually requires extensive regions of homology. The λ-Red system
facilitates the ability to use short regions of homology (10-100 bp) flanking
linear dsDNA fragments for homologous recombination. Additionally, the
RecBCD complex normally expressed in E. coli prevents the use of linear
dsDNA for transformation as the complex's exonuclease activity efficiently
degrades linear dsDNA. Inhibition of the RecBCD complex's
endonuclease activity by gam is essential for efficient homologous
recombination using linear dsDNA fragments.
Combinatorial P1 Transduction System 

Transduction is a phenomenon in which bacterial DNA is
transferred from one bacterial cell (the donor) to another (the recipient) by
a phage particle containing bacterial DNA. When a population of donor
bacteria is infected with a phage, the events of the phage lytic cycle may
be initiated. During lytic infection, the enzymes responsible for packaging
viral DNA into the bacteriophage sometimes package host DNA. The
resulting particle is called a transducing particle. Upon lysis of the cell, a
mixture ("P1 lysate") of transducing particles and normal virions is
released. When this lysate is used to infect a population of recipient cells,
most of the cells become infected with normal virus. However, a small
proportion of the population receives transducing particles that inject the
DNA they received from the previous host bacterium. This DNA can
undergo genetic recombination with the DNA of the other host.
Conventional P1 transduction can move only one genetic trait (i.e. gene)
at a time (donor to recipient cell).

It will be appreciated that a number of host systems may be used
for purposes of the present invention including, but not limited to, those
with known transducing phages such as Agrobacterium, Erythrobacter,
Chlorobium, Chromatium, Flavobacterium, Cytophaga, Rhodobacter,
Rhodococcus, Streptomyces, Brevibacterium, Corynebacteria,
Mycobacterium, Deinococcus, Paracoccus, Escherichia, Bacillus,
Myxococcus, Salmonella, Yersinia, Erwinia, Pantoea, Pseudomonas,
Sphingomonas, Methylomonas, Methylobacter, Methylococcus,
Methylosinus, Methylomicrobium, Methylocystis, Alcaligenes,
Synechocystis, Synechococcus, Anabaena, Thiobacillus,
Methanobacterium, and Klebsiella. Phages suitable for use
in the present method may include, but are not limited to, P1, P2, lambda,
φ80, φ3538, T1, T4, P22, P22 derivatives, ES18, Felix V, P1-CmCs, Ffm,
PY20, Mx4, Mx8, PBS-1, PMB-1, and PBT-1.

The present method provides a system for moving multiple genetic
traits into a single E. coli host in a parallel combinatorial fashion using the
bacteriophage P1 mixtures in combination with the site-specific
recombinase system for removal of selection markers (Figure 12). After
P1 transduction with the P1 lysate mixture made from various donor cells,
the transduced recipient cells are screened for antibiotic resistance and
assayed for increased production of the desired genetic end product.
After selection for the optimized transductants, the antibiotic resistance
marker is removed by a site-specific recombinase. The selected
transductants can be used again as a recipient cell in additional rounds of
P1 transduction in order to engineer multiple chromosomal modifications,
optimizing the production of the desired genetic end product. The present
combinatorial P1 transduction method enables quick and easy
chromosomal trait stacking for optimal production of the desired genetic
end product.
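
The iterative logic of the rounds described above (transduce with a mixed lysate, keep the best transductant, remove the marker, repeat) can be sketched as a simple greedy loop. This is only a toy model under stated assumptions: each round adds at most one trait, the screen is represented by a caller-supplied scoring function, and marker removal is implicit between rounds. The function names and the numeric scores in the example are hypothetical and are not experimental data.

```python
from typing import Callable, FrozenSet, Iterable, List


def stack_traits(
    donor_traits: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
    rounds: int,
) -> List[FrozenSet[str]]:
    """Greedy model of repeated transduction rounds.

    In each round every trait not yet present is tried on the current
    recipient, and the highest-scoring transductant becomes the recipient
    for the next round. Stops early if no candidate improves the score.
    """
    donors = list(donor_traits)
    strain: FrozenSet[str] = frozenset()
    history: List[FrozenSet[str]] = []
    for _ in range(rounds):
        candidates = [strain | {t} for t in donors if t not in strain]
        if not candidates:
            break
        best = max(candidates, key=score)
        if score(best) <= score(strain):
            break
        strain = best  # selected transductant; marker assumed removed here
        history.append(strain)
    return history


if __name__ == "__main__":
    # Hypothetical additive scores, for illustration only.
    toy_gain = {"trait_A": 3.0, "trait_B": 2.0, "trait_C": 1.0}

    def toy_score(strain: FrozenSet[str]) -> float:
        return sum(toy_gain[t] for t in strain)

    for step, s in enumerate(stack_traits(toy_gain, toy_score, rounds=2), 1):
        print(step, sorted(s))
```

A real screen would replace the toy scoring function with a measured product titer for each transductant, and non-additive interactions between traits would then determine which combinations are retained.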

Using the method described above, the promoters of the key
isoprenoid genes that encode rate-limiting enzymes involved in the
isoprenoid pathway were engineered. Replacement of the endogenous
promoters with a strong promoter (PT5) resulted in increased β-carotene
production.

An advantage of the present method of promoter replacement is
that it allows for multiple chromosomal modifications within the host cell.
The system is a means for moving multiple genetic traits into a single host
cell using bacteriophage P1 transduction in combination with a
site-specific recombinase for removal of selection markers (Figures 2 and 12).

The present combinatorial P1 transduction method for promoter
replacement enabled isolation and identification of the ispB gene and its
effect on increasing the production of β-carotene when placed under the
control of the strong promoter. The effect of ispB on increasing the
production of β-carotene was an unexpected and non-obvious result. IspB
(octaprenyl diphosphate synthase), which synthesizes the precursor of the
side chain of the isoprenoid quinones, drains away the FPP substrate
from the carotenoid biosynthetic pathway (Figure 1). The mechanism by
which overexpression of the ispB gene under the control of the phage T5
strong promoter increases β-carotene production is not yet clear. However,
the result suggests that IspB may increase the flux of the carotenoid
biosynthetic pathway. Stacking the ispB gene under the control of a strong
promoter into the chromosome of the engineered E. coli strains facilitated
a further increase in β-carotene production (Figure 11).
Measurement of the Carotenoid End Product 

If the desired genetic end product is a colored product then
transformants can be selected for on the basis of colored colonies, and
the product can be quantitated by UV/vis spectrometry at the product's
characteristic λmax peaks. Alternative analytical methods can also be
used including, but not limited to, HPLC, CE, GC and GC-MS.

In the present invention, β-carotene was measured by UV/vis
spectrometry at β-carotene's characteristic λmax peaks at 425, 450 and
478 nm. The carotenoid was extracted by acetone from the cell pellet. The
host strain included a reporter plasmid for the expression of genes
involved in the synthesis of β-carotene. The reporter plasmid (pPCB15 or
pDCQ108) carried the Pantoea stewartii crtEXYIB gene cluster. The gene
cluster facilitated the production of β-carotene. Therefore, an increase of
carbon flux through the isoprenoid upper pathway will result in an increase
in the amount of β-carotene produced, resulting in colonies with more
intense color on agar plates when compared to the strain that does not
have T5 promoters engineered upstream of the isoprenoid genes. The
amount of carotenoid produced was measured by HPLC analysis.
Detection of β-carotene was measured by absorption at 450 nm at its
respective retention time using HPLC under particular solvent conditions.
Quantitative analysis was carried out by comparing the peak area for
β-carotene to a known β-carotene standard.
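
The peak-area comparison described above amounts to a simple proportion, sketched below. The calculation assumes a single-point external standard, a linear detector response passing through zero, and identical injection conditions for sample and standard; none of these details are specified in the text. The function name, parameter names, and the numbers in the example are hypothetical.

```python
def beta_carotene_mg_per_g_dcw(
    sample_peak_area: float,
    standard_peak_area: float,
    standard_mg: float,
    dry_cell_weight_g: float,
) -> float:
    """Estimate beta-carotene content per gram of dry cell weight.

    Assumes a single-point external standard with a linear response:
    sample amount = standard amount * (sample area / standard area).
    """
    if standard_peak_area <= 0 or dry_cell_weight_g <= 0:
        raise ValueError("standard peak area and dry cell weight must be positive")
    sample_mg = standard_mg * sample_peak_area / standard_peak_area
    return sample_mg / dry_cell_weight_g


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    titer = beta_carotene_mg_per_g_dcw(
        sample_peak_area=1200.0,
        standard_peak_area=600.0,
        standard_mg=0.5,
        dry_cell_weight_g=0.25,
    )
    print(f"{titer:.2f} mg beta-carotene per g dry cell weight")
```

In practice a multi-point calibration curve would usually be preferred to a single standard, since it confirms that the response is linear over the measured range.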
Description of the Preferred Embodiments

E. coli has been genetically modified to create several strains
capable of enhanced production of β-carotene. One of the strains has
been shown to produce up to 6 mg β-carotene per gram of dry cell weight.

Promoter replacement was accomplished using an easy one-step
method of bacterial in vivo chromosomal engineering using two linear
(PCR-generated) DNA fragments in order to increase carotenoid
production in a host cell. The fragments were designed to contain short
flanking regions of homology between the fragments and the target site on
the host (E. coli) chromosome. The phage λ-Red recombinase system
was expressed on a helper plasmid and under control of an
arabinose-inducible promoter for controllable and efficient in vivo triple
homologous recombination between the two PCR-generated DNA fragments
and the host cell's chromosome. At least one of the two linear
double-stranded (ds) DNA fragments used during recombination was
designed to contain a selective marker (kanamycin) flanked by
site-specific recombinase sequences (FRT) (Example 1). The selectable
marker permitted the identification and selection of the cells that had
undergone the desired recombination event. The constructs of the selected
recombinants were verified by sequence analysis. The selective marker was
excised by a second helper plasmid (pCP20) containing the site-specific
recombinase gene under the control of the PR promoter of λ phage
(Examples 6-12 and 17).

A strong promoter (phage PT5) was placed upstream of the E. coli
target genes dxs, idi, ygbBygbP, ispB, ispAdxs (Example 1) via triple
homologous recombination using two (PCR-generated) linear dsDNA
fragments and the targeted chromosomal DNA (Figure 2). In each
example, one of the two fragments contained a kanamycin resistance
marker flanked by site-specific FRT recombinase sequences. Flanking
the site-specific recombinase sequences were homology arms which
contained short (approximately 10-50 bp) regions of homology. A first
recombination region (homology arm #1) was linked to the 5'-end of the
first fragment. A second recombination region (homology arm #2) was
linked to the 3'-end of the first fragment. The second PCR-generated
linear dsDNA fragment contained the PT5 strong promoter. The third
recombination region (homology arm #3) was linked to the 3'-end of the
second fragment. The first recombination region (homology arm #1) had
homology to an upstream portion of the native bacterial chromosomal
promoter targeted for replacement. The second recombination region
(homology arm #2 located on the 3'-end of the first fragment) had
homology to the 5'-end portion of the second fragment. The third
recombination region (homology arm #3) had homology to a downstream
portion of the native bacterial chromosomal promoter targeted for
replacement (Figure 2).

The recombination-proficient E. coli host (containing the λ-Red
recombination system on the helper plasmid pKD46) was transformed with
the two PCR-generated fragments, resulting in the chromosomal
replacement of the targeted native promoter with the construct containing
the kanamycin selectable marker of the first fragment and the PT5 strong
promoter of the second fragment (Examples 1 and 6-12, Figure 2). The
promoter replacement resulted in the formation of an augmented E. coli
chromosomal gene (either the dxs, idi, ygbBygbP, ispB or ispAdxs gene)
operably linked to the introduced non-native promoter. The bacterial host
cells that had undergone the desired recombination event were selected
according to the expression of the selectable marker and their ability to
grow in selective media. The selected recombinants were then
transformed with a second helper plasmid, pCP20 (Cherepanov and
Wackernagel, supra), expressing the flippase (Flp) site-specific
recombinase, which excised the selectable marker (Examples 6-12). The
constructs were confirmed via PCR fragment analysis (Figures 3-5). The
recombinant bacterial host cell containing the augmented isoprenoid
genes (dxs, idi, ygbBygbP, ispB or ispAdxs) and the carotenoid reporter
plasmid (pPCB15) was then tested for increased production of β-carotene.
Placement of one or more of the E. coli dxs, idi, ygbBygbP, ispB or
ispAdxs genes (normally expressed at very low levels) under control of the
strong PT5 promoter resulted in significant increases in β-carotene
production (Examples 18-19, Figure 11).

In another embodiment, the method was used to simultaneously
add a foreign gene and promoter. The first of the two PCR-generated
fragments was designed so that it contained the fusion product of a
selectable marker (kanamycin) and promoter (PT5) (Example 2, Figure 2).
The second PCR-generated fragment contained the fusion product of a
selectable marker (kan-PT5) and the Methylomonas 16a dxs(16a) (SEQ ID
NO:13), dxr(16a) (SEQ ID NO:17) or lytB(16a) (SEQ ID NO:15) genes
(foreign to E. coli). Once again, homology arms were designed to allow
for precise incorporation into the host bacterial chromosome. The desired
recombinants were selected by methods previously described. The
selectable marker was then removed by a site-specific recombinase as
previously described. The recombinant constructs were confirmed by
PCR fragment analysis. β-carotene production in the transformed E. coli
reporter strain was measured as previously described. Cells containing
the Methylomonas 16a dxs(16a) and/or lytB(16a) genes (homologous to
the E. coli dxs and lytB genes) under the control of the PT5 promoter
exhibited an increase in β-carotene production (Figure 11). The present
method was useful in the simultaneous addition of a foreign promoter and
gene. Subsequent removal of the selectable marker is required so that
the process can be repeated, if desired, to engineer bacterial biosynthetic
pathways for increased production of the desired product.

In another embodiment, the bacterial host strain was engineered to
contain multiple chromosomal modifications, including multiple promoter
and gene additions or replacements, so that the production efficiency of
the desired final product is increased. In a preferred embodiment, the
incorporated or augmented chromosomal genes encode enzymes
useful for the production of carotenoids.

In another preferred embodiment, the constructs made by
chromosomal engineering of non-endogenous promoters upstream of
isoprenoid genes and by chromosomally integrating non-endogenous
isoprenoid pathway genes into the host chromosome are combined into a
single strain. The following constructs, each driven by the phage T5
strong promoter (PT5), were built by combinatorial stacking:
PT5-ispAdxs PT5-idi;
PT5-ispAdxs PT5-dxs(16a);
PT5-ispAdxs PT5-dxs(16a) PT5-lytB(16a);
PT5-ispAdxs PT5-dxs(16a) PT5-lytB(16a) PT5-idi;
PT5-dxs PT5-idi;
PT5-dxs PT5-idi PT5-ygbBygbP;
PT5-dxs PT5-idi PT5-ygbBygbP PT5-lytB(16a);
PT5-dxs PT5-idi PT5-ygbBygbP yjeR::Tn5; and
PT5-dxs PT5-idi PT5-ygbBygbP PT5-ispB.
Stacking of these constructs in a combinatorial manner facilitated the
development of engineered host strains capable of significantly increased
carotenoid production.

In another embodiment, gene loci carrying transposon insertions
that confer the ability to increase carotenoid production were engineered
into the host chromosome. The E. coli yjeR gene carrying a Tn5
transposon insertion sequence (yjeR::Tn5; SEQ ID NO:63) was stacked in
combination with PT5-dxs, PT5-idi and PT5-ygbBygbP to create a strain
producing 19-fold higher levels of β-carotene (ATCC PTA-4807).

In another embodiment, an E. coli reporter strain was constructed
for assaying β-carotene production. Briefly, the reporter strain was
created by cloning the gene cluster crtEXYIB from Pantoea stewartii into a
reporter plasmid (pPCB15) that was subsequently used to transform the
E. coli host (Figure 7). The cluster contained many of the genes required
for the synthesis of carotenoids, producing β-carotene in the transformed
E. coli. It should be noted that the crtZ gene (β-carotene hydroxylase)
was included in the gene cluster. However, since no promoter was
present to express the crtZ gene (organized in opposite orientation and
adjacent to the crtB gene), no zeaxanthin was produced. The zeaxanthin
glucosyl transferase enzyme (encoded by the crtX gene located within the
gene cluster) therefore had no substrate for its reaction. Increases in
β-carotene production were reported as increases relative to the control
strain production (Figure 11).

In another embodiment, a new reporter plasmid was created.
Reporter plasmid pPCB15, used for many of the experiments, is
considered a low-copy-number plasmid. A new medium-copy-number
reporter plasmid (pDCQ108) was generated that also contained the
Pantoea stewartii crtEXYIB gene cluster (Example 19). Plasmid pDCQ108
was then used as the reporter plasmid in E. coli PT5-dxs PT5-idi
PT5-ygbBygbP PT5-ispB, leading to an approximately 30-fold increase in
β-carotene production when compared to the control strain (Figure 11;
Examples 20 and 21; Table 9).

It has been speculated that the limits for carotenoid production in
a non-carotenogenic host such as E. coli had been reached at the level of
around 1.5 mg/g cell dry weight (1,500 ppm) due to overload of the
membranes and blocking of membrane functionality (Albrecht et al.,
supra). The present method has solved the stated problem by making
modifications on the E. coli chromosome that resulted in increased
β-carotene production of up to 6 mg per gram dry cell weight (6,000 ppm),
an increase of 30-fold over initial levels with no lethal effect. The bacterial
production of 6,000 ppm carotenoids is much higher than the maximum
accepted limit (1,500 ppm) for carotenoid production in bacteria.
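
The unit relationships in the preceding paragraph can be checked with a short calculation. The following Python sketch is illustrative only: the function names are hypothetical, and the baseline figure it derives is merely implied by the stated 30-fold increase (it is not a value reported in the text).

```python
def mg_per_g_to_ppm(mg_per_g: float) -> float:
    """Convert mg carotenoid per g dry cell weight to ppm by mass (1 mg/g = 1,000 ppm)."""
    return mg_per_g * 1000.0


def implied_baseline_ppm(final_ppm: float, fold_increase: float) -> float:
    """Back-calculate the starting level implied by a reported fold increase.

    Assumption: the fold increase is measured against the same control
    strain and in the same units as the final level.
    """
    return final_ppm / fold_increase


final_ppm = mg_per_g_to_ppm(6.0)      # level reported for the engineered strain
ceiling_ppm = mg_per_g_to_ppm(1.5)    # previously speculated ceiling
baseline_ppm = implied_baseline_ppm(final_ppm, 30.0)

print(f"engineered strain: {final_ppm:.0f} ppm")
print(f"speculated ceiling: {ceiling_ppm:.0f} ppm")
print(f"ratio to ceiling: {final_ppm / ceiling_ppm:.1f}x")
print(f"implied baseline: {baseline_ppm:.0f} ppm")
```

Under these assumptions the engineered strain sits at four times the speculated ceiling, and the control level implied by the 30-fold figure is about 0.2 mg/g.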

One of skill in the art will recognize that the present method can be
applied to a variety of hosts in addition to E. coli. Use of the present
method in other hosts is supported by the fact that: 1) the isoprenoid
pathway is common in bacteria, 2) the λ-Red system has been reported to
work in a variety of hosts, and 3) phage transduction is known to occur in
many hosts.

EXAMPLES 

The present invention is further defined in the following Examples.
It should be understood that these Examples, while indicating preferred
embodiments of the invention, are given by way of illustration only. From
the above discussion and these Examples, one skilled in the art can
ascertain the essential characteristics of this invention, and without
departing from the spirit and scope thereof, can make various changes
and modifications of the invention to adapt it to various usages and
conditions.

GENERAL METHODS

Standard recombinant DNA and molecular cloning techniques used
in the Examples are well known in the art and are described by Sambrook,
J., Fritsch, E. F. and Maniatis, T., Molecular Cloning: A Laboratory
Manual; Cold Spring Harbor Laboratory Press: Cold Spring Harbor,
(1989) (Maniatis) and by T. J. Silhavy, M. L. Berman, and L. W. Enquist,
Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold
Spring Harbor, NY (1984) and by Ausubel, F. M. et al., Current Protocols
in Molecular Biology, pub. by Greene Publishing Assoc. and
Wiley-Interscience (1987).

Materials and methods suitable for the maintenance and growth of
bacterial cultures are well known in the art. Techniques suitable for use in
the following examples may be found as set out in Manual of Methods for
General Bacteriology (Phillipp Gerhardt, R. G. E. Murray, Ralph N.
Costilow, Eugene W. Nester, Willis A. Wood, Noel R. Krieg and G. Briggs
Phillips, eds), American Society for Microbiology, Washington, DC. (1994))
or by Thomas D. Brock in Biotechnology: A Textbook of Industrial
Microbiology, Second Edition, Sinauer Associates, Inc., Sunderland, MA
(1989). All reagents, restriction enzymes and materials used for the
growth and maintenance of bacterial cells were obtained from Aldrich
Chemicals (Milwaukee, WI), DIFCO Laboratories (Detroit, MI),
GIBCO/BRL (Gaithersburg, MD), or Sigma Chemical Company (St. Louis,
MO) unless otherwise specified.

Manipulations of genetic sequences were accomplished using the
suite of programs available from the Genetics Computer Group Inc.
(Wisconsin Package Version 9.0, Genetics Computer Group (GCG),
Madison, WI). Where the GCG program "Pileup" was used, the gap
creation default value of 12 and the gap extension default value of 4 were
used. Where the GCG "Gap" or "Bestfit" programs were used, the default
gap creation penalty of 50 and the default gap extension penalty of 3 were
used. Multiple alignments were created using the FASTA program
incorporating the Smith-Waterman algorithm (W. R. Pearson, Comput.
Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date 1992,
111-120. Editor(s): Suhai, Sandor. Publisher: Plenum, New York, NY).
In any case where program parameters were not prompted for, in these or
any other programs, default values were used.

The meaning of abbreviations is as follows: "h" means hour(s),
"min" means minute(s), "sec" means second(s), "d" means day(s),
"µL" means microliter(s), "mL" means milliliter(s), "L" means liter(s),
and "rpm" means revolutions per minute.

EXAMPLE 1 

Construction of E. coli Strains with the Phage PT5 Promoter
Chromosomally Integrated Upstream of the Isoprenoid Genes (Promoter
Replacement)

The native promoters of the E. coli isoprenoid genes dxs, idi,
ygbBygbP, ispB, and ispAdxs (Figure 1) were replaced with the PT5
promoter using the two PCR-fragment chromosomal integration method as
described in Figure 2. The method for replacement is based on
homologous recombination via the λ-Red recombinase encoded on a
helper plasmid. Recombination occurs between the E. coli chromosome
and two PCR fragments that contain 20-50 bp homology patches at both
ends (Figure 2). For integration of the PT5 promoter
upstream of these genes, a two PCR fragment method was employed. In
this method, the two linear fragments included a DNA fragment (1489 bp)
containing a kanamycin selectable marker (kan) flanked by site-specific
recombinase target sequences (FRT) and a DNA fragment (154 bp)
containing a phage T5 promoter (PT5) comprising the -10 and -35
consensus promoter sequences, lac operator (lacO), and a ribosomal
binding site (rbs).

By using the two PCR fragment method, the kanamycin selectable
marker and PT5 promoter (kan-PT5) were integrated upstream of the dxs,
idi, ygbBP, ispB, and ispAdxs genes, yielding kan-PT5-dxs, kan-PT5-idi,
kan-PT5-ygbBP, kan-PT5-ispB, and kan-PT5-ispAdxs. The linear DNA
fragment (1489 bp) containing a kanamycin selectable marker was
synthesized by PCR from plasmid pKD4 (Datsenko and Wanner, supra)
with primer pairs as follows in Table 3.



TABLE 3
Primers for Amplification of the Kanamycin Selectable Marker

Primer Name        Primer Sequence                                                        SEQ ID NO:
5'-kan(dxs)        TGGAAGCGCTAGCGGACTACATCATCCAGCGTAATAAATAACGTCTTGAGCGATTGTGTAG¹         21
5'-kan(idi)        TCTGATGCGCAAGCTGAAGAAAAATGAGCATGGAGAATAATATGACGTCTTGAGCGATTGTGTAG¹     22
5'-kan(ygbBP)      GACGCGTCGAAGCGCGCACAGTCTGCGGGGCAAAACAATCGATAACGTCTTGAGCGATTGTGTAG¹     23
5'-kan(ispAdxs)    ACCATGACGGGGCGAAAAATATTGAGAGTCAGACATTCATGTGTAGGCTGGAGCTGCTTC¹          24
3'-kan             GAAGACGAAAGGGCCTCGTGATACGCCTATTTTTATAGGTTATATGAATATCCTCCTTAGTTCC²      25

¹ The underlined sequences illustrate each respective homology arm chosen to match
sequences in the upstream region of the chromosomal integration site, while the
remainder is the priming sequence.
² The underlined sequences illustrate homology arm chosen to match sequences in the
5'-end region of the T5 promoter DNA fragment.



The second linear DNA fragment (154 bp) containing the PT5
promoter was synthesized by PCR from pQE30 (QIAGEN, Inc., Valencia,
CA) with primer pairs as follows in Table 4.



TABLE 4
Primers for Amplification of the PT5 Promoter

Primer Name        Primer Sequence                                                           SEQ ID NO:
5'-T5              CTAAGGAGGATATTCATATAACCTATAAAAATAGGCGTATCACGAGGCCC¹                       26
3'-T5(dxs)         GGAGTCGACCAGTGCCAGGGTCGGGTATTTGGCAATATCAAAACTCATAGTTAATTTCTCCTCTTTAATG²   27
3'-T5(idi)         TGGGAACTCCCTGTGCATTCAATAAAATGACGTGTTCCGTTTGCATAGTTAATTTCTCCTCTTTAATG²     28
3'-T5(ygbBP)       CGGCCGCCGGAACCACGGCGCAAACATCCAAATGAGTGGTTGCCATAGTTAATTTCTCCTCTTTAATG²     29
3'-T5(ispAdxs)     CCTGCTTAACGCAGGCTTCGAGTTGCTGCGGAAAGTCCATAGTTAATTTCTCCTCTTTAATG²           30

¹ The underlined sequences illustrate homology arm chosen to match sequences in the
3'-end region of the kanamycin DNA fragment.
² The underlined sequences illustrate each respective homology arm chosen to match
sequences in the downstream region of the chromosomal integration site.
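
The footnotes to Tables 3 and 4 state that the 3'-kan primer carries an arm matching the 5'-end of the T5 promoter fragment, and that the 5'-T5 primer carries an arm matching the 3'-end of the kanamycin fragment. A minimal Python sketch of how that junction can be checked is given below. The function and variable names are illustrative and are not part of the original disclosure; the sequences are SEQ ID NO:25 and SEQ ID NO:26 as printed in the tables.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# 3'-kan primer (SEQ ID NO:25), as listed in Table 3.
KAN_3_PRIMER = (
    "GAAGACGAAAGGGCCTCGTGATACGCCT"
    "ATTTTTATAGGTTATATGAATATCCTCCTT"
    "AGTTCC"
)

# 5'-T5 primer (SEQ ID NO:26), as listed in Table 4.
T5_5_PRIMER = (
    "CTAAGGAGGATATTCATATAACCTATAAAA"
    "ATAGGCGTATCACGAGGCCC"
)

# The two primers anneal to opposite strands, so shared junction sequence
# shows up as the reverse complement of one primer inside the other.
junction = reverse_complement(T5_5_PRIMER)
offset = KAN_3_PRIMER.find(junction)

print("junction found:", offset >= 0)
print("shared length (nt):", len(junction) if offset >= 0 else 0)
```

With the sequences as printed, the whole 5'-T5 primer is contained (as its reverse complement) within the 3'-kan primer, so the two PCR products share an identical stretch at the junction that can serve as homology arm #2.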

The linear DNA fragment (1,647 bp) containing the fused kanamycin
selectable marker-phage T5 promoter was synthesized by PCR from pSUH5
with primer pairs as follows in Table 5. The pSUH5 plasmid (Figure 6;
SEQ ID NO:66) was constructed by cloning a phage T5 promoter (PT5)
region (SEQ ID NO:33) into the NdeI restriction endonuclease site of
pKD4 (Datsenko and Wanner, supra).

TABLE 5
Primers for Amplification of the Fused Kanamycin Selectable Marker-
Phage PT5 Promoter

Primer Name        Primer Sequence                                                        SEQ ID NO:
5'-kanT5(ispB)     ACCATAAACCCTAAGTTGCCTTTGTTCACAGTAAGGTAATCGGGGCGTCTTGAGCGATTGTGTAG¹     31
3'-kanT5(ispB)     CGCCATATCTTGCGCGGTTAACTCATTGATTTTTTCTAAATTCATAGTTAATTTCTCCTCTTTAATG²   32

¹ The underlined sequences illustrate each respective homology arm chosen to match
sequences in the upstream region of the chromosomal integration site.
² The underlined sequences illustrate each respective homology arm chosen to match
sequences in the downstream region of the chromosomal integration site.

Standard PCR conditions were used to amplify the linear DNA
fragments with AmpliTaq Gold® polymerase (Applied Biosystems, Foster
City, CA) as follows:

PCR reaction:
Step 1   94°C   3 min
Step 2   93°C   30 sec
Step 3   55°C   1 min
Step 4   72°C   3 min
Step 5   Go To Step 2, 30 cycles
Step 6   72°C   5 min

PCR reaction mixture:
0.5 µL   plasmid DNA
5 µL     10X PCR buffer
1 µL     dNTP mixture (10 mM)
1 µL     5'-primer (20 µM)
1 µL     3'-primer (20 µM)
0.5 µL   AmpliTaq Gold® polymerase
41 µL    sterilized dH2O
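
As a consistency check on the recipe and program above, the following Python sketch totals the reaction volume, derives final concentrations from the stock concentrations, and estimates the hold time of the cycling program. The names are illustrative. It assumes that "30 cycles" means 30 passes through Steps 2-4 and it ignores ramp times between temperatures.

```python
# Volumes (µL) of each component in the reaction mixture listed above.
REACTION_UL = {
    "plasmid DNA": 0.5,
    "10X PCR buffer": 5.0,
    "dNTP mixture (10 mM)": 1.0,
    "5'-primer (20 µM)": 1.0,
    "3'-primer (20 µM)": 1.0,
    "AmpliTaq Gold polymerase": 0.5,
    "sterilized dH2O": 41.0,
}


def total_volume_ul(mix: dict) -> float:
    """Sum the component volumes of a reaction mixture."""
    return sum(mix.values())


def final_concentration(stock: float, added_ul: float, total_ul: float) -> float:
    """Dilution formula C_final = C_stock * V_added / V_total (units follow the stock)."""
    return stock * added_ul / total_ul


def program_minutes(cycles: int = 30) -> float:
    """Total hold time: initial 3 min, then cycles of 0.5 + 1 + 3 min, then 5 min."""
    return 3.0 + cycles * (0.5 + 1.0 + 3.0) + 5.0


total = total_volume_ul(REACTION_UL)
print("total volume (µL):", total)
print("dNTP final (µM):", final_concentration(10_000.0, 1.0, total))
print("primer final (µM):", final_concentration(20.0, 1.0, total))
print("buffer final (X):", final_concentration(10.0, 5.0, total))
print("program hold time (min):", program_minutes())
```

The derived values (50 µL total, 200 µM dNTPs, 0.4 µM of each primer) agree with the concentrations quoted for the colony PCR mixture later in this Example.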






After completing the PCR reactions, 50 µL of each PCR reaction
mixture was run on a 1% agarose gel and the PCR products were purified
using the QIAquick Gel Extraction Kit™ as per the manufacturer's
instructions (Cat. # 28704, QIAGEN Inc., Valencia, CA). The PCR
products were eluted with 10 µL of distilled water. The DNA Clean &
Concentrator™ kit (Zymo Research, Orange, CA) was used to further
purify the PCR product fragments as per the manufacturer's instructions.
The PCR products were eluted with 6-8 µL of distilled water to a
concentration of 0.5-1.0 µg/µL.

The E. coli MC1061 strain, carrying the λ-Red recombinase
expression plasmid pKD46 (ampR) (SEQ ID NO:65), was used as a host
strain for the chromosomal integration of the PCR fragments. The strain
was constructed by transformation of E. coli strain MC1061 with the λ-Red
recombinase expression plasmid, pKD46 (ampR). Transformants were
selected on 100 µg/mL ampicillin LB plates at 30°C.

For transformation, electroporation was performed using 1-5 µg of
the purified PCR products carrying the kanamycin marker and PT5
promoter. Approximately one-half of the cells transformed were spread on
LB plates containing 25 µg/mL kanamycin in order to select
antibiotic-resistant transformants. After incubating the plate at 37°C
overnight, antibiotic-resistant transformants were selected as follows:
10 colonies of kan-PT5-dxs, 12 colonies of kan-PT5-idi, 10 colonies of
kan-PT5-ygbBP, 3 colonies of kan-PT5-ispB, and 19 colonies of kan-PT5-ispA.

PCR analysis was used to confirm the integration of both the
kanamycin selectable marker and the PT5 promoter in the correct location
on the E. coli chromosome. For PCR, a colony was resuspended in 50 µL
of PCR reaction mixture containing 200 µM dNTPs, 2.5 U AmpliTaq™
(Applied Biosystems), and 0.4 µM of specific primer pairs. Test primers
were chosen to match sequences of the regions located in the kanamycin
marker (5'-primer) and the early coding region of each isoprenoid gene
(3'-primer) (Figure 3). Sequences of these primers are listed in Tables 3,
4, and 5 above and the PCR reaction was performed as described above. The
resultant E. coli strains carrying each kan-PT5-isoprenoid gene fusion on
the chromosome were used for stacking multiple kan-PT5-isoprenoid gene
fusions on the chromosome to construct E. coli strains for increasing
β-carotene production as described in Examples 6-12 and 17.

EXAMPLE 2 

Construction of E. coli Strains with Methylomonas 16a dxs(16a), dxr(16a)
and lytB(16a) Genes Chromosomally Integrated

The Methylomonas 16a (ATCC PTA-2402) isoprenoid genes dxs, dxr
and lytB (WO 02/20733 A2), denoted as "dxs(16a)" (SEQ ID NO:13),
"dxr(16a)" (SEQ ID NO:17) and "lytB(16a)" (SEQ ID NO:15), were each
co-integrated with the fused kan-PT5 promoter into the
inter-operon regions located at 30.9, 78.6 and 18.1 min, respectively, of
the E. coli chromosome using the two PCR-fragment chromosomal
integration method as described in Figure 2. The principle for
chromosomal integration of a foreign gene is the same as described in
Example 1.

The linear DNA fragment (1,647 bp) containing the fused kanamycin
selectable marker-PT5 promoter was synthesized by PCR from pSUH5
with primer pairs as follows in Table 6. The pSUH5 plasmid (Figure 6)
was constructed by cloning a PT5 promoter region (SEQ ID NO:33) into
the NdeI restriction endonuclease site of pKD4 (Datsenko and Wanner,
supra).



TABLE 6
Primers for Amplification of the Kanamycin Selectable Marker-PT5
Promoter

Primer Name         Primer Sequence                                                        SEQ ID NO:
5'-kanT5(dxs16a)    r.ACTAACGCr,OGCACATTGCTGCGGGCTTTTTGATTCATTTCGCACGTCTTGAGCGATTGTGTAG¹   34
5'-kanT5(dxr16a)    TAAAGGGCTAAGAGTAGTGTGCTCTTAGCCCTTAATTACGTTTCCCGTCTTGAGCGATTGTGTAG¹     35
5'-kanT5(lytB16a)   CTACAACTGGCGAGATGCATAGCGAGTATAATTTGTATTTTGCGTCGTCTTGAGCGATTGTGTAG¹     36
3'-kanT5(dxs16a)    AGTAGAGGGAAGTCTTTGGAAAGAGCCATAGTTAATTTCTCCTCTTTAATG²                   37
3'-kanT5(dxr16a)    ACGGTGCCGCCGCAATGATGCTGTCCACCAGTTAATTTCTCCTCTTTAATG²                   38
3'-kanT5(lytB16a)   CCACGGGGGTTTGCGAGTACGATTTGCATAGTTAATTTCTCCTCTTTAATG²                   39

¹ The underlined sequences illustrate each respective homology arm chosen to match
sequences in the upstream region of the chromosomal integration site, while the
remainder is the priming sequence.
² The underlined sequences illustrate homology arm chosen to match sequences in the
5'-end region of the foreign gene DNA fragment.



The linear DNA fragment containing the Methylomonas 16a dxs, dxr or
lytB gene was synthesized by PCR from Methylomonas 16a (ATCC
PTA-2402) genomic DNA with primer pairs as follows in Table 7.

TABLE 7
Primers for Amplification of the Foreign Gene

Primer Name     Primer Sequence                                                          SEQ ID NO:
5'-dxs16a       ACAGAATTCATTAAAGAGGAGAAATTAACTATGGCTCTTTCCAAAGACTTCCCTC¹                 40
5'-dxr16a       ACAGAATTCATTAAAGAGGAGAAATTAACTGGTGGACAGCATCATTGCGGCGGCA¹                 41
5'-lytB16a      ACAGAATTCATTAAAGAGGAGAAATTAACTATGCAAATCGTACTCGCAAACCCCC¹                 42
3'-dxs16a       AGGAGCGAAGTGATTATCAGTATGCTGTTCATATAGCCTCGAATTATCAAGCGCAAAACTGTTCGATG²    43
3'-dxr16a       GGCATTTTCACTCTGGCAATGCGCATAAACGCTTTCAAAGTCCTGTTAAGCTACCAAGGTCTTGATG²     44
3'-lytB16a      AGTGGCGGACGGGCAAACAAGGGTAACATAGGATCAATGAGGGTTATTGATCACGCTTGCATATGTTT²    45

¹ The underlined sequences illustrate homology arm chosen to match sequences in the
3'-end region of the fused kanamycin-phage PT5 promoter DNA fragment.
² The underlined sequences illustrate each respective homology arm chosen to match
sequences in the downstream region of the chromosomal integration site.

The PCR reaction, purification and electro-transformation were
performed as described in Example 1. Kanamycin-resistant
transformants were selected, including 7 colonies of E. coli
kan-PT5-dxs(16a), 3 colonies of E. coli kan-PT5-dxr(16a) and 12 colonies
of E. coli kan-PT5-lytB(16a). Among these, the colonies that had a correct
integration of kan-PT5-dxs(16a), kan-PT5-dxr(16a) or kan-PT5-lytB(16a)
into the target site of the E. coli chromosome were selected by PCR
analysis (Figures 3, 4, and 5).

EXAMPLE 3 

Cloning of β-Carotene Production Genes from Pantoea stewartii

Primers were designed using the sequence from Erwinia uredovora
to amplify a fragment by PCR containing the crt genes. These sequences
included 5'-3':

ATGACGGTCTGCGCAAAAAAACACG SEQ ID NO:19
GAGAAATTATGTTGTGGATTTGGAATGC SEQ ID NO:20

Chromosomal DNA was purified from Pantoea stewartii (ATCC
no. 8199) and Pfu Turbo polymerase (Stratagene, La Jolla, CA) was used
in a PCR amplification reaction under the following conditions: 94°C,
5 min; 94°C (1 min)-60°C (1 min)-72°C (10 min) for 25 cycles, and 72°C
for 10 min. A single product of approximately 6.5 kb was observed
following gel electrophoresis. Taq polymerase (Perkin Elmer, Foster City,
CA) was used in a ten-minute 72°C reaction to add additional 3'
adenosine nucleotides to the fragment for TOPO cloning into pCR4-TOPO
(Invitrogen, Carlsbad, CA) to create the plasmid pPCB13. Following
transformation to E. coli DH5α (Life Technologies, Rockville, MD) by
electroporation, several colonies appeared to be bright yellow in color,
indicating that they were producing a carotenoid compound. Following
plasmid isolation as instructed by the manufacturer using the Qiagen
(Valencia, CA) miniprep kit, the plasmid containing the 6.5 kb amplified
fragment was transposed with pGPS1.1 using the GPS-1 Genome
Priming System kit (New England Biolabs, Inc., Beverly, MA). A number
of these transposed plasmids were sequenced from each end of the
transposon. Sequence was generated on an ABI Automatic sequencer
using dye terminator technology (US 5,366,860; EP 272007) using
transposon-specific primers. Sequence assembly was performed with the
Sequencher program (Gene Codes Corp., Ann Arbor, MI).

EXAMPLE 4

Identification and Characterization of Bacterial Genes

Genes encoding crtE, X, Y, I, B, and Z were identified by
conducting BLAST (Basic Local Alignment Search Tool; Altschul, S. F.,
et al., J. Mol. Biol. 215:403-410 (1993)) searches for similarity to
sequences contained in the BLAST "nr" database (comprising all
non-redundant GenBank® CDS translations, sequences derived from the
3-dimensional structure Brookhaven Protein Data Bank, the SWISS-PROT
protein sequence database, EMBL, and DDBJ databases). The
sequences obtained in Example 3 were analyzed for similarity to all
publicly available DNA sequences contained in the "nr" database using the
BLASTN algorithm provided by the National Center for Biotechnology
Information (NCBI). The DNA sequences were translated in all reading
frames and compared for similarity to all publicly available protein
sequences contained in the "nr" database using the BLASTX algorithm
(Gish, W. and States, D., Nature Genetics, 3:266-272 (1993)) provided by
the NCBI.

All comparisons were done using either the BLASTNnr or
BLASTXnr algorithm. The results of the BLAST comparison are given in
Table 7, which summarizes the sequences to which they have the most
similarity. Table 7 displays data based on the BLASTXnr algorithm with
values reported in expect values. The Expect value estimates the
statistical significance of the match, specifying the number of matches,
with a given score, that are expected in a search of a database of this size
absolutely by chance.
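
For orientation, the Expect value described above is conventionally computed from the Karlin-Altschul relation E = K * m * n * exp(-lambda * S), where S is the raw alignment score, m the query length and n the database length. The Python sketch below illustrates that relation only. The default values of lambda and K are illustrative assumptions (they depend on the scoring matrix and gap penalties) and are not taken from this document, and the function name is hypothetical.

```python
import math


def expect_value(raw_score: float, query_len: int, db_len: int,
                 lam: float = 0.267, k: float = 0.041) -> float:
    """Karlin-Altschul estimate of chance matches scoring at least raw_score.

    lam and k are scoring-system parameters; the defaults here are
    illustrative assumptions, not values reported in the text.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Hypothetical search: 300-residue query against a 1e9-residue database.
for score in (50, 100, 200):
    print(score, f"{expect_value(score, 300, 1_000_000_000):.3g}")
```

The sketch shows the two properties the paragraph relies on: the Expect value falls exponentially as the score rises, and it scales linearly with the size of the database searched.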
EXAMPLE 5

Analysis of Gene Function by Transposon Mutagenesis

Several plasmids carrying transposons which were inserted into
each coding region, including crtE, crtX, crtY, crtI, crtB, and crtZ, were
chosen using sequence data generated in Example 3. These plasmid
variants were transformed to E. coli MG1655 and grown in 100 mL
Luria-Bertani broth in the presence of 100 µg/mL ampicillin. Cultures were
grown for 18 hr at 26°C, and the cells were harvested by centrifugation.
Carotenoids were extracted from the cell pellets using 10 mL of acetone.
The acetone was dried under nitrogen and the carotenoids were
resuspended in 1 mL of methanol for HPLC analysis. A Beckman System
Gold® HPLC with Beckman Gold Nouveau Software (Columbia, MD) was
used for the study. The crude extraction (0.1 mL) was loaded onto a
125 x 4 mm RP8 (5 µm particles) column with corresponding guard
column (Hewlett-Packard, San Fernando, CA). The flow rate was
1 mL/min, while the solvent program used was: 0-11.5 min 40%
water/60% methanol; 11.5-20 min 100% methanol; 20-30 min 40%
water/60% methanol. The spectrum data were collected by the Beckman
photodiode array detector (model 168).
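
The solvent program above can be written as a simple lookup of mobile-phase composition against run time. The Python sketch below is illustrative: the function name is hypothetical, and it assumes step changes at 11.5 and 20 min because the text does not state whether the transitions are ramped.

```python
def methanol_percent(t_min: float) -> float:
    """Percent methanol in the mobile phase at run time t_min (remainder is water).

    Assumption: composition changes stepwise at the stated boundaries.
    """
    if not 0.0 <= t_min <= 30.0:
        raise ValueError("solvent program is defined for 0-30 min only")
    if t_min < 11.5:
        return 60.0   # 0-11.5 min: 40% water / 60% methanol
    if t_min < 20.0:
        return 100.0  # 11.5-20 min: 100% methanol
    return 60.0       # 20-30 min: 40% water / 60% methanol


# Retention times reported later in this Example for the carotenoid peaks.
for name, rt in (("beta-carotene", 15.8), ("lycopene", 15.2), ("phytoene", 16.3)):
    print(f"{name}: {rt} min, {methanol_percent(rt):.0f}% methanol")
```

Under the step-change assumption, all three carotenoid peaks reported in this Example elute during the 100% methanol segment.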

In the clone with wild type crtEXYIBZ, the carotenoid was found to
have a retention time of 15.8 min and absorption maxima at 450 nm and
475 nm. These values matched those observed for the β-carotene
standard. This suggested that the crtZ gene, organized in the
opposite orientation, was not expressed in this construct. The transposon
insertion in crtZ had no effect, as expected (data not shown).

HPLC spectral analysis also revealed that a clone with a transposon
insertion in crtX also produced β-carotene. This is consistent with the
proposed function of crtX encoding a zeaxanthin glucosyl transferase
enzyme at a later step of the carotenoid pathway following synthesis of
β-carotene.

The clone with a transposon insertion in crtY did not produce
β-carotene. The carotenoid's elution time (15.2 min) and absorption
spectra (443 nm, 469 nm, 500 nm) agree with those of the lycopene standard.
Accumulation of lycopene in the crtY mutant confirmed the role of crtY as a
lycopene cyclase encoding gene.

The crtI extraction, when monitored at 286 nm, had a peak with
retention time of 16.3 min and with absorption spectra of 276 nm, 286 nm,
297 nm, which agrees with the reported spectrum for phytoene. Detection
of phytoene in the crtI mutant confirmed the function of the crtI gene as
one encoding a phytoene dehydrogenase enzyme.

The acetone extract from the crtE mutant or crtB mutant was
clear. Loss of pigmented carotenoids in these mutants indicated that both
the crtE and crtB genes are essential for carotenoid synthesis. No
carotenoid was observed in either mutant, which is consistent with the
proposed function of crtB encoding a prephytoene pyrophosphate
synthase and crtE encoding a geranylgeranyl pyrophosphate synthetase.
Both enzymes are required for β-carotene synthesis.

Results of the transposon mutagenesis experiments are shown 
below in Table 9. The site of transposon insertion into the gene cluster 
crtEXYIB is recorded, along with the color of the E. coli colonies observed 
on LB plates, the identity of the carotenoid compound (as determined by 
HPLC spectral analysis), and the experimentally assigned function of each 
gene. 

TABLE 9
Transposon Insertion Analysis of Carotenoid Gene Function

Transposon               Colony    Carotenoid          Assigned gene function
insertion site           color     observed by HPLC
Wild type (with no       Yellow    β-carotene
transposon insertion)
crtE                     White     None                Geranylgeranyl pyrophosphate synthetase
crtB                     White     None                Prephytoene pyrophosphate synthase
crtI                     White     Phytoene            Phytoene dehydrogenase
crtY                     Pink      Lycopene            Lycopene cyclase
crtZ                     Yellow    β-carotene          β-carotene hydroxylase
crtX                     Yellow    β-carotene          Zeaxanthin glucosyl transferase
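
The identifications in Table 9 rest on comparing each extract's absorption maxima with those of standards. A minimal Python sketch of that comparison follows, using only the maxima quoted in this Example. The function name and the 2 nm matching tolerance are assumptions introduced for illustration.

```python
# Absorption maxima (nm) quoted in this Example for each compound.
REFERENCE_MAXIMA = {
    "beta-carotene": (450, 475),
    "lycopene": (443, 469, 500),
    "phytoene": (276, 286, 297),
}


def identify_carotenoid(maxima, tolerance_nm=2):
    """Return the reference compound whose maxima all lie within tolerance_nm.

    Returns None when no reference matches. tolerance_nm is an assumed
    value, not one stated in the text.
    """
    observed = tuple(sorted(maxima))
    for name, reference in REFERENCE_MAXIMA.items():
        if len(reference) == len(observed) and all(
            abs(o - r) <= tolerance_nm for o, r in zip(observed, reference)
        ):
            return name
    return None


print(identify_carotenoid((443, 469, 500)))  # crtY insertion extract
print(identify_carotenoid((276, 286, 297)))  # crtI insertion extract
print(identify_carotenoid((450, 475)))       # wild type cluster
```

An extract with no detectable maxima, as reported for the crtE and crtB insertions, would simply not be passed to such a lookup.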






EXAMPLE 6 

Construction of E. coli PT5-ispAdxs PT5-idi for Increased
β-Carotene Production

In order to characterize the effect of the chromosomal integration of
the PT5 promoter in front of the isoprenoid genes on β-carotene
production, a strain (E. coli PT5-ispAdxs PT5-idi) containing a
chromosomally integrated PT5 promoter upstream from the ispAdxs and idi
genes and capable of producing β-carotene was constructed.

First, P1 lysate of the E. coli kan-PT5-ispAdxs strain was prepared
by infecting a growing culture of bacteria with the P1 phage and allowing
the cells to lyse. For P1 infection, the E. coli kan-PT5-ispAdxs strain was
inoculated in 4 mL LB medium with 25 µg/mL kanamycin, grown at 37°C
overnight, and then sub-cultured with a 1:100 dilution of the overnight
culture in 10 mL LB medium containing 5 mM CaCl2. After 20-30 min of
growth at 37°C, 10^7 P1vir phages were added. The cell-phage mixture was
aerated for 2-3 h at 37°C until lysed; several drops of chloroform were
added, and the mixture was vortexed for 30 sec and incubated for an
additional 30 min at room temperature. The mixture was then centrifuged
for 10 min at 4500 rpm, and the supernatant was transferred into a new
tube to which several drops of chloroform were added.

Second, the P1 lysate made on the E. coli kan-PT5-ispAdxs strain was
transduced into the recipient strain, E. coli MG1655 containing a
β-carotene biosynthesis expression plasmid pPCB15 (camR) (Figure 6).
The plasmid pPCB15 (camR) encodes the carotenoid biosynthesis gene
cluster (crtEXYIB) from Pantoea stewartii (ATCC no. 8199). The pPCB15
plasmid was constructed by ligating SmaI-digested pSU18 vector (Bartolome
et al., Gene, 102:75-78 (1991)) with a blunt-ended PmeI/NotI
fragment carrying crtEXYIB from pPCB13 (Example 3). The E. coli
MG1655 pPCB15 recipient cells were grown to mid-log phase (1-2 x 10^8
cells/mL) in 4 mL LB medium with 25 µg/mL chloramphenicol at 37°C.
Cells were spun down for 10 min at 4500 rpm and resuspended in 2 mL of
10 mM MgSO4 and 5 mM CaCl2. Recipient cells (100 µL) were mixed
with 1 µL, 10 µL, or 100 µL of P1 lysate stock (10? pfu/µL) made from the
E. coli kan-PT5-ispAdxs strain and incubated at 30°C for 30 min. The
recipient cell-lysate mixture was spun down at 6500 rpm for 30 sec,
resuspended in 100 µL of LB medium with 10 mM sodium citrate, and
incubated at 37°C for 1 h. Cells were plated on LB plates containing both
25 µg/mL kanamycin and 25 µg/mL chloramphenicol in order to select for
antibiotic-resistant transductants and incubated at 37°C for 1 or 2 days.
Six kanamycin-resistant transductants were selected.

To eliminate the kanamycin selectable marker from the chromosome, the
FLP recombinase expression plasmid pCP20 (ampR) (ATCC PTA-4455)
(Cherepanov and Wackernagel, supra), which has a temperature-sensitive
origin of replication, was transiently transformed into one of the
kanamycin-resistant transductants by electroporation. Cells were spread
onto LB agar plates containing 100 µg/mL ampicillin and 25 µg/mL
chloramphenicol and grown at 30°C for 1 day. Colonies were
picked, streaked on LB plates containing 25 µg/mL chloramphenicol but no
ampicillin, and incubated at 43°C overnight. Because plasmid pCP20
has a temperature-sensitive origin of replication, it was cured from the
host cells by culturing the cells at 43°C. The colonies were tested for
ampicillin and kanamycin sensitivity, to confirm loss of pCP20 and of the
kanamycin selectable marker, by streaking them on LB plates containing
100 µg/mL ampicillin or 25 µg/mL kanamycin. In this manner the E. coli
PT5-ispAdxs strain was constructed.

In order to further stack kan-PT5-idi on the chromosome of E. coli
PT5-ispAdxs, a P1 lysate made on the E. coli kan-PT5-idi strain was
transduced into the recipient strain, E. coli PT5-ispAdxs, as described
above. Approximately 85 transductants were selected. After transduction, the
kanamycin selectable marker was eliminated from the chromosome as
described above, yielding the E. coli PT5-ispAdxs PT5-idi strain.

For the E. coli PT5-ispAdxs PT5-idi strain, the correct integration of
the PT5 promoter in front of the ispAdxs and idi genes, and elimination of
the kanamycin selectable marker from the E. coli chromosome, were
confirmed by PCR analysis. A colony of the E. coli PT5-ispAdxs PT5-idi
strain was resuspended in 50 µL of PCR reaction mixture containing 200
µM dNTPs, 2.5 U AmpliTaq™ (Applied Biosystems), and 0.4 µM of different
combinations of specific primer pairs: T-kan
(5'-ACCGGATATCACCACTTATCTGCTC-3'; SEQ ID NO:46) and B-ispA
(5'-CCTAATAATGCGCCATACTGCATGG-3'; SEQ ID NO:47); T-T5
(5'-TAACCTATAAAAATAGGCGTATCACGAGGCCC-3'; SEQ ID NO:48) and
B-ispA; T-kan and B-idi (5'-CAGCCAACTGGAGAACGCGAGATGT-3';
SEQ ID NO:49); and T-T5 and B-idi. Test primers were chosen to amplify
regions located either in the kanamycin marker or the PT5 promoter and
the early region of the ispAdxs or idi gene (Figure 3). The PCR reaction was
performed as described in Example 1. The PCR results indicated the
elimination of the kanamycin selectable marker from the E. coli
chromosome (Figure 3, lanes 2 and 4). The chromosomal integration of
the PT5 promoter fragment upstream of the ispAdxs and idi genes was
confirmed based on the expected sizes of the PCR products, 285 bp and
274 bp, respectively (Figure 3, lanes 1 and 3).

EXAMPLE 7

Construction of E. coli PT5-ispAdxs PT5-dxs(16a) Strain for Increased
β-Carotene Production

In order to construct the E. coli PT5-ispAdxs PT5-dxs(16a) strain
containing a chromosomally integrated PT5 promoter upstream of the
ispAdxs genes and the Methylomonas 16a dxs (dxs(16a)) gene, a P1 lysate
made on the E. coli kan-PT5-dxs(16a) strain was transduced into the
recipient strain, E. coli kan-PT5-ispAdxs containing the β-carotene
biosynthesis expression plasmid pPCB15 (camR), described in Example 3.
Seventy-eight kanamycin-resistant transductants were selected. The kanamycin
selectable marker was eliminated from the chromosome of the
transductants using a FLP recombinase expression system as described
in Example 3, yielding the E. coli PT5-ispAdxs PT5-dxs(16a) strain.

In the E. coli PT5-ispAdxs PT5-dxs(16a) strain, the correct
integration of the phage T5 promoter in front of the ispAdxs genes and of
PT5-dxs(16a) at the inter-operon region located at 30.9 min on the E. coli
chromosome, and elimination of the kanamycin selectable marker, were
confirmed by PCR analysis. A colony of the E. coli PT5-ispAdxs
PT5-dxs(16a) strain was tested by PCR with different combinations of
specific primer pairs: T-kan and B-ispA; T-T5 and B-ispA; T-kan and
B-dxs(16a) (5'-GCGATATTGTATGTCTGATTCAGGA-3'; SEQ ID NO:50); and
T-T5 and B-dxs(16a). Test primers were chosen to amplify regions located
either in the kanamycin resistance gene or the PT5 promoter and the
downstream region of the chromosomal integration site (Figure 3). The PCR
reaction was performed as described in Example 1. The PCR results
indicated the elimination of the kanamycin selectable marker from the
E. coli chromosome (Figure 3, lanes 6 and 8). The chromosomal integration
of the PT5 promoter fragment upstream of the ispAdxs gene and the
integration of the PT5-dxs(16a) gene at the inter-operon region were
confirmed based on the expected sizes of the PCR products, 285 bp and
2184 bp, respectively (Figure 3, lanes 5 and 7).

EXAMPLE 8 

Construction of E. coli PT5-ispAdxs PT5-dxs(16a) PT5-lytB(16a) Strain for
Increased β-Carotene Production

In order to create a bacterial strain capable of increased carotenoid
production, the Methylomonas 16a lytB (lytB(16a)) gene under the control
of a PT5 promoter was further stacked into the E. coli PT5-ispAdxs
PT5-dxs(16a) strain by P1 transduction in combination with the FLP
recombination system. A P1 lysate made on the E. coli kan-PT5-lytB(16a)
strain was transduced into the recipient strain, E. coli kan-PT5-ispAdxs
kan-PT5-dxs(16a) containing the β-carotene biosynthesis expression plasmid
pPCB15 (camR). Forty-two kanamycin-resistant transductants were
selected. The kanamycin selectable marker was eliminated from the
chromosome of the transductants using a FLP recombinase expression
system as described in Example 6, yielding E. coli PT5-ispAdxs
PT5-dxs(16a) PT5-lytB(16a).

For the E. coli PT5-ispAdxs PT5-dxs(16a) PT5-lytB(16a) strain, the
correct integration of the PT5 promoter upstream of the ispAdxs genes and
the addition of the PT5-dxs(16a) and PT5-lytB(16a) genes at the inter-operon
regions located at 30.9 min and 18.1 min, respectively, on the E. coli
chromosome, and elimination of the kanamycin selectable marker, were
confirmed by PCR analysis. A colony of the E. coli PT5-ispAdxs
PT5-dxs(16a) PT5-lytB(16a) strain was tested by PCR with different
combinations of specific primer pairs: T-kan and B-ispA; T-T5 and B-ispA;
T-kan and B-dxs(16a); T-T5 and B-dxs(16a); T-kan and B-lytB(16a)
(5'-TCCACTGGATGCGGGAAGCTGGCAG-3'; SEQ ID NO:51); and T-T5 and
B-lytB(16a). Test primers were chosen to amplify regions located either in
the kanamycin resistance gene or the PT5 promoter and the downstream
region of the chromosomal integration site (Figure 3). The PCR reaction
was performed as described in Example 1. The PCR results indicated the
elimination of the kanamycin selectable marker from the E. coli
chromosome (Figure 3, lanes 10, 12 and 14). The chromosomal
integration of the PT5 promoter fragment upstream of the ispAdxs gene
and integration of the PT5-dxs(16a) and PT5-lytB(16a) genes at the
inter-operon regions were confirmed based on the expected sizes of the PCR
products, 285 bp, 2184 bp, and 1282 bp, respectively (Figure 3, lanes 9, 11
and 13).

EXAMPLE 9

Construction of E. coli PT5-ispAdxs PT5-dxs(16a) PT5-lytB(16a) PT5-idi
Strain for Increased β-Carotene Production

In order to create a bacterial strain capable of increased carotenoid
production, the PT5-idi gene was further stacked into the E. coli
PT5-ispAdxs PT5-dxs(16a) PT5-lytB(16a) strain by P1 transduction in
combination with the FLP recombination system. A P1 lysate made from the
E. coli kan-PT5-idi strain was transduced into the recipient strain, E. coli
kan-PT5-ispAdxs kan-PT5-dxs(16a) PT5-lytB(16a) containing the
β-carotene biosynthesis expression plasmid pPCB15. Approximately
450 kanamycin-resistant transductants were selected. The kanamycin
selectable marker was eliminated from the chromosome of the
transductants using a FLP recombinase expression system as described
in Example 6, yielding E. coli PT5-ispAdxs PT5-dxs(16a) PT5-lytB(16a)
PT5-idi.

For the E. coli PT5-ispAdxs PT5-dxs(16a) PT5-lytB(16a) PT5-idi
strain, the correct integration of the PT5 promoter upstream of the ispAdxs
and idi genes and the integration of the PT5-dxs(16a) and PT5-lytB(16a)
genes at the inter-operon regions located at 30.9 min and 18.1 min,
respectively, on the E. coli chromosome, and elimination of the kanamycin
selectable marker, were confirmed by PCR analysis. A colony of the E. coli
PT5-ispAdxs PT5-dxs(16a) PT5-lytB(16a) PT5-idi strain was tested by
PCR with different combinations of specific primer pairs: T-kan and B-ispA;
T-T5 and B-ispA; T-kan and B-dxs(16a); T-T5 and B-dxs(16a); T-kan and
B-lytB(16a); T-T5 and B-lytB(16a); T-kan and B-idi; and T-T5 and B-idi. Test
primers were chosen to amplify regions located either in the kanamycin
resistance gene or the PT5 promoter and the downstream region of the
chromosomal integration site (Figure 3). The PCR reaction was
performed as described in Example 1. The PCR results indicated the
elimination of the kanamycin selectable marker from the E. coli
chromosome (Figure 4, lanes 16, 18, 20, and 22). The chromosomal
integration of the PT5 promoter fragment upstream of the ispAdxs and idi
genes and the integration of the PT5-dxs(16a) and PT5-lytB(16a)
constructs at the inter-operon regions were confirmed based on the
expected sizes of the PCR products, 285 bp, 274 bp, 2184 bp, and 1282 bp,
respectively (Figure 4, lanes 15, 17, 19 and 21).

EXAMPLE 10 

Construction of E. coli PT5-dxs PT5-idi Strain for Increased β-Carotene
Production

In order to characterize the effect of the chromosomal integration of
the strong PT5 promoter in front of the dxs and idi genes on β-carotene
production, E. coli PT5-dxs PT5-idi, capable of producing β-carotene, was
constructed.

A P1 lysate made with the E. coli kan-PT5-dxs strain was transduced
into the recipient strain, E. coli MG1655 containing the β-carotene
biosynthesis expression plasmid pPCB15 (camR), as described in Example
6. Sixteen kanamycin-resistant transductants were selected. The
kanamycin selectable marker was eliminated from the chromosome of the
transductants using a FLP recombinase expression system, yielding the
E. coli PT5-dxs strain.

In order to stack kan-PT5-idi on the chromosome of E. coli PT5-dxs, a P1
lysate made on the E. coli kan-PT5-idi strain was transduced into the
recipient strain, E. coli PT5-dxs, as described above. Approximately 450
kanamycin-resistant transductants were selected. After transduction,
the kanamycin selectable marker was eliminated from the chromosome as
described above, yielding the E. coli PT5-dxs PT5-idi strain.

For the E. coli PT5-dxs PT5-idi strain, the correct integration of the
phage PT5 promoter upstream of the dxs and idi genes on the E. coli
chromosome, and elimination of the kanamycin selectable marker, were
confirmed by PCR analysis. A colony of the E. coli PT5-dxs PT5-idi strain
was tested by PCR with different combinations of specific primer pairs:
T-kan and B-dxs (5'-TGGCAACAGTCGTAGCTCCTGGGTGG-3'; SEQ ID
NO:52); T-T5 and B-dxs; T-kan and B-idi; and T-T5 and B-idi. Test primers
were chosen to amplify regions located either in the kanamycin marker or
the PT5 promoter and the downstream region of the chromosomal
integration site (Figure 3). The PCR reaction was performed as described
in Example 1. The PCR results indicated the elimination of the kanamycin
selectable marker from the E. coli chromosome (Figure 4, lanes 24 and 26).
The chromosomal integration of the PT5 promoter fragment upstream of the
dxs and idi genes was confirmed based on the expected sizes of the PCR
products, 229 bp and 274 bp, respectively (Figure 4, lanes 23 and 25).

EXAMPLE 11

Construction of E. coli PT5-dxs PT5-idi PT5-ygbBP Strain for Increased
β-Carotene Production

In order to create a bacterial strain capable of increased carotenoid
production, the PT5-ygbBP gene was further stacked into the E. coli PT5-dxs
PT5-idi strain by P1 transduction in combination with the FLP
recombination system. A P1 lysate made with the E. coli kan-PT5-ygbBP
strain was transduced into the recipient strain, E. coli kan-PT5-dxs
kan-PT5-idi containing the β-carotene biosynthesis expression plasmid
pPCB15 (camR), as described above. Twenty-one kanamycin-resistant
transductants were selected. The kanamycin selectable marker was
eliminated from the chromosome of the transductants using a FLP
recombinase expression system, yielding the E. coli PT5-dxs PT5-idi
PT5-ygbBP strain.

For the E. coli PT5-dxs PT5-idi PT5-ygbBP strain, the correct
integration of the PT5 promoter upstream of the dxs, idi and ygbBP genes on
the E. coli chromosome, and elimination of the kanamycin selectable
marker, were confirmed by PCR analysis. A colony of the E. coli PT5-dxs
PT5-idi PT5-ygbBP strain was tested by PCR with different combinations of
specific primer pairs: T-kan and B-dxs; T-T5 and B-dxs; T-kan and B-idi;
T-T5 and B-idi; T-kan and B-ygb (5'-CCAGCAGCGCATGCACCGAGTGTTC-3';
SEQ ID NO:53); and T-T5 and B-ygb. Test primers were chosen to amplify
regions located either in the kanamycin resistance marker or the PT5
promoter and the downstream region of the chromosomal integration site
(Figure 3). The PCR reaction was performed as described in Example 1.
The PCR results indicated the elimination of the kanamycin selectable
marker from the E. coli chromosome (Figure 4, lanes 28, 30 and 32). The
chromosomal integration of the PT5 promoter fragment upstream of the dxs,
idi and ygbBP genes was confirmed based on the expected sizes of the PCR
products, 229 bp, 274 bp, and 296 bp, respectively (Figure 4, lanes 27, 29,
and 31).

EXAMPLE 12 

Construction of E. coli PT5-dxs PT5-idi PT5-ygbBP PT5-lytB(16a) Strain
for Increased β-Carotene Production

In order to create a bacterial strain capable of increased carotenoid
production, the Methylomonas 16a lytB (lytB(16a)) gene under the control
of a PT5 promoter was further stacked into the E. coli PT5-dxs PT5-idi
PT5-ygbBP strain by P1 transduction in combination with the FLP
recombination system. A P1 lysate made with the E. coli kan-PT5-lytB(16a)
strain was transduced into the recipient strain, E. coli kan-PT5-dxs
kan-PT5-idi PT5-ygbBP containing the β-carotene biosynthesis expression
plasmid pPCB15 (camR), described previously. Approximately 300
kanamycin-resistant transductants were selected. The kanamycin
selectable marker was eliminated from the chromosome of the
transductants using a FLP recombinase expression system, yielding the
E. coli PT5-dxs PT5-idi PT5-ygbBP PT5-lytB(16a) strain.

For the E. coli PT5-dxs PT5-idi PT5-ygbBP PT5-lytB(16a) strain, the
correct integration of the PT5 promoter upstream of the dxs, idi and ygbBP
genes and integration of the PT5-lytB(16a) gene at the inter-operon region
located at 18.1 min on the E. coli chromosome, and elimination of the
kanamycin selectable marker, were confirmed by PCR analysis. A colony
of the E. coli PT5-dxs PT5-idi PT5-ygbBP PT5-lytB(16a) strain was tested
by PCR with different combinations of specific primer pairs: T-kan and
B-dxs; T-T5 and B-dxs; T-kan and B-idi; T-T5 and B-idi; T-kan and B-ygb;
T-T5 and B-ygb; T-kan and B-lytB(16a); and T-T5 and B-lytB(16a). Test
primers were chosen to amplify regions located either in the kanamycin
resistance marker or the phage PT5 promoter and the downstream region of
the chromosomal integration site (Figure 3). The PCR reaction was
performed as described in Example 1. The PCR results indicated the
elimination of the kanamycin selectable marker from the E. coli
chromosome (Figure 4, lanes 34, 36, 38 and 40). The chromosomal
integration of the PT5 promoter fragment upstream of the dxs, idi and
ygbBP genes and the integration of the PT5-lytB(16a) gene were confirmed
based on the expected sizes of the PCR products, 229 bp, 274 bp, 296 bp,
and 1282 bp, respectively (Figure 4, lanes 33, 35, 37, and 39).

EXAMPLE 13

Isolation of Chromosomal Mutations that Increase Carotenoid Production

Wild type E. coli is non-carotenogenic and synthesizes only the
farnesyl pyrophosphate precursor for carotenoids. When the crtEXYIB
gene cluster from Pantoea stewartii was introduced into E. coli, β-carotene
was synthesized and the cells exhibited the yellow color characteristic of
β-carotene. E. coli chromosomal mutations that increase carotenoid
production should result in colonies that are more intensely
pigmented or deeper yellow in color (Figure 8).

The plasmid pPCB15 (camR) encodes the carotenoid biosynthesis
gene cluster (crtEXYIB) from Pantoea stewartii (ATCC no. 8199). The
pPCB15 plasmid was constructed by ligating SmaI-digested pSU18 vector
(Bartolome et al., Gene, 102:75-78 (1991)) with a blunt-ended
PmeI/NotI fragment carrying crtEXYIB from pPCB13 (Example 3). E. coli
MG1655 transformed with pPCB15 was used for transposon mutagenesis.
Mutagenesis was performed using the EZ::TN™ <KAN-2>Tnp
Transposome™ kit (Epicentre Technologies, Madison, WI) according to the
manufacturer's instructions. The transposon (1 µL) was electroporated into
50 µL of highly electro-competent MG1655 (pPCB15) cells. The mutant
cells were spread onto LB-Noble Agar (Difco Laboratories, Detroit, MI)
plates with 25 µg/mL kanamycin and 25 µg/mL chloramphenicol, and
grown at 37°C overnight. Tens of thousands of mutant colonies were
visually examined for production of increased levels of β-carotene as
evaluated by deeper yellow color development. The candidate mutants

were re-streaked to fresh LB-Noble Agar plates and glycerol frozen stocks 
made for further characterization. 

EXAMPLE 14 
Quantitation of Carotenoid Production 

To confirm that the mutants selected for increased β-carotene production
by visual screening for deeper yellow colonies in Example 13
indeed produced more β-carotene, the carotenoids were extracted from
cultures grown from each mutant strain and quantified
spectrophotometrically. Each candidate mutant strain was cultured in 10
mL LB medium with 25 µg/mL chloramphenicol in 50 mL flasks overnight
with shaking at 250 rpm. MG1655 (pPCB15) was used as the control.
Carotenoids were extracted from each cell pellet for 15 min into 1 mL
acetone, and the amount of β-carotene produced was measured at
455 nm. Cell density was measured at 600 nm. The ratio OD455/OD600
was used to normalize β-carotene production for different cultures.
β-Carotene production was also verified by HPLC. Among the mutant clones
tested, eight showed increased β-carotene production (Figure 9). Mutant
Y15 showed an almost two-fold increase in β-carotene production, as shown
in Figure 8, which represents the averages of three independent
measurements with standard deviations calculated and indicated as
standard deviation bars.
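
The normalization described above can be illustrated with a minimal Python sketch. It assumes three independent (OD455, OD600) readings per strain; the function names and all numeric readings are hypothetical and are not data from this example.

```python
from statistics import mean, stdev


def normalized_carotenoid(od455: float, od600: float) -> float:
    """Return the OD455/OD600 ratio used to normalize pigment to cell density."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return od455 / od600


def summarize(replicates):
    """Return (mean, sample standard deviation) of the ratio over replicates."""
    ratios = [normalized_carotenoid(a455, a600) for a455, a600 in replicates]
    return mean(ratios), stdev(ratios)


def fold_change(mutant_mean: float, control_mean: float) -> float:
    """Return mutant production relative to the control strain."""
    return mutant_mean / control_mean


# Hypothetical (OD455, OD600) readings, three independent measurements each.
control = [(0.20, 2.0), (0.22, 2.0), (0.18, 2.0)]
mutant = [(0.40, 2.0), (0.38, 2.0), (0.42, 2.0)]

control_mean, control_sd = summarize(control)
mutant_mean, mutant_sd = summarize(mutant)
print(f"fold change vs. control: {fold_change(mutant_mean, control_mean):.2f}")
```

Dividing by OD600 before averaging keeps each replicate's pigment reading paired with its own cell density, which is what allows cultures of different densities to be compared.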

EXAMPLE 15 

Mapping of the Transposon Insertions on the E. coli Chromosome 
The transposon insertion site in each mutant was identified by PCR
and sequencing directly from chromosomal DNA of the mutant strains. A
modified single-primer PCR method (Karlyshev et al., BioTechniques,
28:1078-82, 2000) was used. For this method, a 100 µL volume of
overnight culture was heated at 99°C for 10 min in a PCR machine. Cell
debris was removed by centrifugation at 4000 g for 10 min. A 1 µL
volume of supernatant was used in a 50 µL PCR reaction using either the
Tn5PCRF (5'-GCTGAGTTGAAGGATCAGATC-3'; SEQ ID NO:54) or the
Tn5PCRR (5'-CGAGCAAGACGTTTCCCGTTG-3'; SEQ ID NO:55) primer.
PCR was carried out as follows: 5 min at 95°C; 20 cycles of 92°C for 30
sec, 60°C for 30 sec, 72°C for 3 min; 30 cycles of 92°C for 30 sec, 40°C
for 30 sec, 72°C for 2 min; 30 cycles of 92°C for 30 sec, 60°C for 30 sec,
72°C for 2 min. A 10 µL volume of each PCR product was
electrophoresed on an agarose gel to evaluate product length. A 40 µL
volume of each PCR product was purified using the Qiagen PCR cleanup
kit and sequenced using sequencing primers Kan-2 FP-1
(5'-ACCTACAACAAAGCTCTCATCAACC-3'; SEQ ID NO:56) or Kan-2 RP-1
(5'-GCAATGTAACATCAGAGATTTTGAG-3'; SEQ ID NO:57) provided by
the EZ::TN™ <KAN-2>Tnp Transposome™ kit. The chromosomal
insertion site of the transposon was identified as the junction between the
Tn5 transposon and MG1655 chromosome DNA by aligning the sequence
obtained from each mutant with the E. coli MG1655 genomic sequence.
Mutant Y15 carried a Tn5 insertion in yjeR (Ghosh, S., PNAS, 96:4372-
4377 (1999)). The Tn5 cassette was located very close to the carboxy
terminal end of the gene (Figure 10) and most likely resulted in a
functional, although truncated, protein product.
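
As an aid to reading the three-stage cycling program above, the following Python sketch encodes it as data and adds up the programmed hold time. The `Stage` class and function names are illustrative assumptions, and ramping time between temperatures is ignored, so the real run time on a thermocycler would be longer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One block of the program: `cycles` repeats of (temperature C, seconds) steps."""
    cycles: int
    steps: tuple


# Single-primer PCR program as stated in the text.
PROGRAM = (
    Stage(1, ((95, 300),)),                       # initial 5 min at 95 C
    Stage(20, ((92, 30), (60, 30), (72, 180))),   # 20 cycles, 60 C annealing
    Stage(30, ((92, 30), (40, 30), (72, 120))),   # 30 cycles, 40 C annealing
    Stage(30, ((92, 30), (60, 30), (72, 120))),   # 30 cycles, 60 C annealing
)


def total_cycles(program) -> int:
    """Count amplification cycles, skipping single-pass stages."""
    return sum(stage.cycles for stage in program if stage.cycles > 1)


def hold_minutes(program) -> float:
    """Sum programmed hold times in minutes (no ramping time included)."""
    seconds = sum(
        stage.cycles * sum(sec for _, sec in stage.steps) for stage in program
    )
    return seconds / 60


print(total_cycles(PROGRAM), "cycles,", hold_minutes(PROGRAM), "min of hold time")
```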

EXAMPLE 16 

Confirmation of Transposon Insertions in the E. coli Chromosome

To confirm the transposon insertion sites identified in Example 15,
chromosome-specific primers were designed 400-800 bp upstream and
downstream from the transposon insertion site for each mutant. Primers
Y15_F (5'-GGATCGATCTTGAGATGACC-3'; SEQ ID NO:58) and Y15_R
(5'-GCTTTCGTAATTTTCGCATTTCTG-3'; SEQ ID NO:59) were used to
screen the Y15 mutant. Three sets of PCR reactions were performed for
each mutant. The first set (PCR1) used a chromosome-specific
upstream primer with a chromosome-specific downstream primer.
The second set (PCR2) used a chromosome-specific upstream primer
with a transposon-specific primer (either Kan-2 FP-1 or Kan-2 RP-1,
depending on the orientation of the transposon in the chromosome). The
third set (PCR3) used a chromosome-specific downstream primer with a
transposon-specific primer. PCR conditions were: 5 min at 95°C; 30 cycles
of 92°C for 30 sec, 55°C for 30 sec, 72°C for 1 min; then 5 min at 72°C.
Wild type MG1655 (pPCB15) cells served as control cells. For the control
cells, the expected wild type bands were detected in PCR1, and no mutant
band was detected in PCR2 or PCR3. For all eight mutants, no wild
type bands were detected in PCR1, and the expected mutant bands were
detected in both PCR2 and PCR3. The size of the products in PCR2 and
PCR3 correlated well with the insertion sites in each specific gene.
Therefore, the mutants contained the transposon insertions as indicated in
Example 15.
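
The decision logic of the three PCR sets can be summarized in a short Python sketch. The function name and the "inconclusive" outcome are illustrative assumptions added here; the text itself reports only the wild type and confirmed-insertion patterns.

```python
def classify_insertion(pcr1_wt_band: bool, pcr2_band: bool, pcr3_band: bool) -> str:
    """Interpret the band pattern from PCR1, PCR2 and PCR3.

    pcr1_wt_band: wild type band from the two chromosome-specific primers.
    pcr2_band:    band from upstream chromosomal + transposon-specific primer.
    pcr3_band:    band from downstream chromosomal + transposon-specific primer.
    """
    if pcr1_wt_band and not pcr2_band and not pcr3_band:
        return "wild type"
    if not pcr1_wt_band and pcr2_band and pcr3_band:
        return "insertion confirmed"
    # Any other pattern is not described in the text (assumed label).
    return "inconclusive"


print(classify_insertion(True, False, False))   # pattern seen for control cells
print(classify_insertion(False, True, True))    # pattern seen for the mutants
```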
EXAMPLE 17

Construction of E. coli PT5-dxs PT5-idi PT5-ygbBP yjeR::Tn5 Strain for
Increased β-Carotene Production

In order to create a bacterial strain capable of increased carotenoid
production, the yjeR::Tn5 gene (SEQ ID NO:63), partially knocked out by a
transposon (Tn5) (kanR) as discovered by the experiments outlined in
Examples 13-16, was further stacked into the E. coli PT5-dxs PT5-idi
PT5-ygbBP strain by P1 transduction. The yjeR gene, which encodes an
oligoribonuclease with 3'-to-5' exoribonuclease activity on small
oligoribonucleotides, was identified by random transposon (Tn5)
insertional mutagenesis as a target for increasing β-carotene production.
A P1 lysate made on the E. coli yjeR::Tn5 strain was transduced into the
recipient strain, E. coli PT5-dxs PT5-idi PT5-ygbBP containing the
β-carotene biosynthesis expression plasmid pPCB15 (camR), described
previously. Six kanamycin-resistant transductants were selected.

For the E. coli PT5-dxs PT5-idi PT5-ygbBP yjeR::Tn5 strain, the
correct integration of the PT5 promoter upstream of the dxs, idi and ygbBP
genes and integration of the yjeR::Tn5 gene on the E. coli chromosome
were confirmed by PCR fragment analysis. A colony of the E. coli PT5-dxs
PT5-idi PT5-ygbBP yjeR::Tn5 strain was tested by PCR with different
combinations of specific primer pairs: T-kan and B-dxs; T-T5 and B-dxs;
T-kan and B-idi; T-T5 and B-idi; T-kan and B-ygb; T-T5 and B-ygb; and
T-Tn5yjeR (5'-GCAATGTAACATCAGAGATTTTGAG-3'; SEQ ID NO:60)
and B-yjeR (5'-GCTTTCGTAATTTTCGCATTTCTG-3'; SEQ ID NO:61).
Test primers were chosen to amplify regions located either in the
kanamycin selection marker or the PT5 promoter and the downstream
region of the chromosomal integration site (Figure 3). The PCR reaction
was performed as described in Example 1. The PCR results indicated the
elimination of the kanamycin selectable marker from the E. coli
chromosome (Figure 4, lanes 42, 44, and 46). The chromosomal
integration of the PT5 promoter fragment upstream of the dxs, idi and
ygbBP genes and the integration of the transposon (Tn5) into the yjeR gene
(yjeR::Tn5) were confirmed based on the expected sizes of the PCR products,
229 bp, 274 bp, 296 bp, and 285 bp, respectively (Figure 4, lanes 41, 43,
45, and 47).






EXAMPLE 18 

Construction of E. coli PT5-dxs PT5-idi PT5-ygbBP PT5-ispB Strain for
Increased β-Carotene Production

The E. coli PT5-dxs PT5-idi PT5-ygbBP PT5-ispB strain was
constructed by P1 transduction in combination with the Flp site-specific
recombinase for marker removal. A P1 lysate made from the E. coli
kan-PT5-ispB strain was transduced into the recipient strain, E. coli
PT5-dxs PT5-idi PT5-ygbBP containing the β-carotene biosynthesis
expression plasmid pPCB15 (camR). Thirty-six kanamycin-resistant
transductants were selected. The kanamycin selectable marker was
eliminated from the chromosome as described in Example 6, yielding
E. coli PT5-dxs PT5-idi PT5-ygbBP PT5-ispB.

The stacking of the ispB gene under the control of the strong PT5
promoter resulted in an unexpected increase in β-carotene production. This
was a non-obvious result because IspB (octaprenyl diphosphate
synthase), which supplies the precursor of the side chain of the isoprenoid
quinones, drains the FPP precursor away from the carotenoid biosynthetic
pathway (Figure 1). The mechanism by which overexpression of the ispB gene
under the control of the PT5 promoter increases β-carotene production is
not yet clear. However, the result suggests that IspB may increase the
flux through the carotenoid biosynthetic pathway.

For the E. coli PT5-dxs PT5-idi PT5-ygbBP PT5-ispB strain, the
correct integration of the phage PT5 promoter in front of the dxs, idi,
ygbBP, and ispB genes, and elimination of the kanamycin selectable
marker, were confirmed by PCR analysis. A colony of the E. coli PT5-dxs
PT5-idi PT5-ygbBP PT5-ispB strain was tested by PCR with different
combinations of specific primer pairs: T-T5 and B-dxs; T-kan and B-dxs;
T-T5 and B-idi; T-kan and B-idi; T-T5 and B-ygb; T-kan and B-ygb; T-T5 and
B-ispB (5'-AGTACAGCAATCATCGGACGAATACG-3'; SEQ ID NO:62); and T-kan
and B-ispB. Test primers were chosen to amplify regions located either in
the kanamycin selectable marker or the PT5 promoter and the
downstream region of the chromosomal integration site (Figure 3). The
PCR reaction was performed as described in Example 1. The PCR
results indicated the elimination of the kanamycin selectable marker from
the E. coli chromosome (Figure 5, lanes 49, 51, 53, and 55). The
chromosomal integration of the PT5 promoter upstream of the dxs, idi,
ygbBP and ispB genes was confirmed based on the expected sizes of the
PCR products, 229 bp, 274 bp, 296 bp, and 318 bp, respectively
(Figure 5, lanes 48, 50, 52, and 54).
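
The PCR verification used in Examples 6-18 follows one pattern per locus: the T-T5 primer pair should give a product of the expected size, and the T-kan pair should give no product once the marker has been removed. A minimal Python sketch of that check is given below. The expected sizes are taken from the text; the function names, the observed band values, and the 20 bp gel-reading tolerance are assumptions made for illustration.

```python
# Expected T-T5 product sizes (bp) reported in Examples 6-18.
EXPECTED_BP = {
    "ispAdxs": 285,
    "idi": 274,
    "dxs(16a)": 2184,
    "lytB(16a)": 1282,
    "dxs": 229,
    "ygbBP": 296,
    "ispB": 318,
}


def locus_ok(locus, t5_band_bp, kan_band_bp, tolerance_bp=20):
    """Return True if the promoter band matches and no kanamycin band is seen.

    A band value of None means no product was observed for that primer pair.
    """
    expected = EXPECTED_BP[locus]
    promoter_ok = t5_band_bp is not None and abs(t5_band_bp - expected) <= tolerance_bp
    marker_removed = kan_band_bp is None
    return promoter_ok and marker_removed


def strain_ok(observations):
    """Check every locus; `observations` maps locus -> (T-T5 band, T-kan band)."""
    return all(locus_ok(locus, t5, kan) for locus, (t5, kan) in observations.items())


# Hypothetical gel readings for a PT5-dxs PT5-idi PT5-ygbBP PT5-ispB colony.
gel = {"dxs": (230, None), "idi": (275, None), "ygbBP": (300, None), "ispB": (318, None)}
print(strain_ok(gel))
```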

EXAMPLE 19 

Transformation of pDCQ108 into E. coli PT5-dxs PT5-idi PT5-ygbBP
PT5-ispB Strain

The low copy number plasmid pPCB15 (containing the β-carotene
synthesis genes Pantoea crtEXYIB), used as a reporter plasmid for
monitoring β-carotene production in E. coli PT5-dxs PT5-idi PT5-ygbBP
PT5-ispB, was replaced with the medium copy number plasmid pDCQ108
(ATCC PTA-4823) containing the β-carotene synthesis genes Pantoea
crtEXYIB. The plasmid pPCB15 was eliminated from the E. coli PT5-dxs
PT5-idi PT5-ygbBP PT5-ispB strain by streaking on an LB plate, incubating
at 37°C for 2 d, and picking a white colony.

The plasmid pDCQ108 (tetR) was transformed into the E. coli PT5-dxs
PT5-idi PT5-ygbBP PT5-ispB strain (white colony lacking a carotenoid
reporter plasmid). Electro-transformation was performed as described in
Example 1. Transformants were selected on LB plates containing 25 µg/mL
tetracycline at 37°C. The resultant transformants were the E. coli PT5-dxs
PT5-idi PT5-ygbBP PT5-ispB strain carrying pDCQ108.

20 EXAMPLE 20 

Measurement of B-Carotene Production in E. coli Strains with 

Chromosomal Integrations 
β-Carotene production of the 9 chromosomally engineered E. coli
strains, E. coli pPCB15 PT5-ispAdxs PT5-idi, E. coli pPCB15 PT5-ispAdxs
PT5-dxs(16a), E. coli pPCB15 PT5-ispAdxs PT5-dxs(16a) PT5-lytB(16a),
E. coli pPCB15 PT5-ispAdxs PT5-dxs(16a) PT5-lytB(16a) PT5-idi, E. coli
pPCB15 PT5-dxs PT5-idi, E. coli pPCB15 PT5-dxs PT5-idi PT5-ygbBP,
E. coli pPCB15 PT5-dxs PT5-idi PT5-ygbBP PT5-lytB(16a), E. coli pPCB15
PT5-dxs PT5-idi PT5-ygbBP yjeR::Tn5, and E. coli pDCQ108 PT5-dxs
PT5-idi PT5-ygbBP PT5-ispB, was quantified by the following
spectrophotometric method. Quantitative analysis of β-carotene
production was achieved by measuring the spectrum of β-carotene's
characteristic λmax peaks at 425, 450 and 478 nm. The 8 chromosomally
engineered E. coli control strains were grown in 5 mL LB containing
25 µg/mL of chloramphenicol at 37 °C for 24 h, and then harvested by
centrifugation at 4000 rpm for 10 min. The β-carotene pigment was
extracted by resuspending the cell pellet in 1 mL of acetone, vortexing for
1 min, and then rocking the sample for 1 h at room temperature. Following
centrifugation at 4000 rpm for 10 min, the absorption of the
acetone layer containing β-carotene was measured at 450 nm using an
Ultrospec 3000 spectrophotometer (Amersham Biosciences, Piscataway,
NJ). The production of β-carotene in E. coli pPCB15 PT5-ispAdxs PT5-idi
and E. coli pPCB15 PT5-ispAdxs PT5-dxs(16a) was approximately 3.5-fold
and 4.3-fold higher than that of the control strain, E. coli pPCB15,
respectively (Figure 11). Additional stacking of PT5-lytB(16a) and PT5-idi
in E. coli pPCB15 PT5-ispAdxs PT5-dxs(16a) PT5-lytB(16a) and E. coli
pPCB15 PT5-ispAdxs PT5-dxs(16a) PT5-lytB(16a) PT5-idi did not increase
the production of β-carotene significantly. The production of β-carotene in
E. coli pPCB15 PT5-dxs PT5-idi was approximately 4.4-fold higher than
that of the E. coli pPCB15 control strain. Additional stacking of PT5-ygbBP
and PT5-lytB(16a) in E. coli pPCB15 PT5-dxs PT5-idi PT5-ygbBP and
E. coli pPCB15 PT5-dxs PT5-idi PT5-ygbBP PT5-lytB(16a) increased
production of β-carotene by 41% and 45%, respectively, compared to that of
E. coli pPCB15 PT5-dxs PT5-idi (Figure 11). The production of β-carotene
in E. coli pPCB15 PT5-dxs PT5-idi PT5-ygbBP yjeR::Tn5 was approximately
19-fold higher than that of the E. coli pPCB15 control strain. The E. coli
pDCQ108 PT5-dxs PT5-idi PT5-ygbBP PT5-ispB strain showed the best titer
of β-carotene production, approximately 30-fold higher than the E. coli
pPCB15 control strain.
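
As a rough illustration of how the fold increases and percentage gains reported in this Example relate to one another, the following Python sketch compounds a fold increase over the control with a further percentage gain over an intermediate strain. The function name is hypothetical, and the resulting value is simple arithmetic on the approximate figures stated above, not an additional measurement.

```python
def compound_fold(base_fold: float, percent_gain: float) -> float:
    """Fold increase over the control after a further percent gain.

    base_fold:    fold increase of the intermediate strain over the control
    percent_gain: additional gain (in percent) relative to the intermediate strain
    """
    return base_fold * (1.0 + percent_gain / 100.0)


# PT5-dxs PT5-idi was reported at ~4.4-fold over the control; stacking
# PT5-ygbBP was reported to add ~41 % on top of that strain.
ygbBP_fold = compound_fold(4.4, 41.0)
print(f"approx. {ygbBP_fold:.1f}-fold over the pPCB15 control")
```

Expressing both kinds of figure on the same scale makes it easier to compare strains that were benchmarked against different reference strains.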

EXAMPLE 21 

Determination of β-Carotene Content in E. coli PT5-dxs PT5-idi PT5-ygbBP
yjeR::Tn5 and E. coli PT5-dxs PT5-idi PT5-ygbBP PT5-ispB

Example 20 demonstrated that the E. coli pPCB15 PT5-dxs PT5-idi
PT5-ygbBP yjeR::Tn5 (ATCC PTA-4807) and E. coli pDCQ108 PT5-dxs
PT5-idi PT5-ygbBP PT5-ispB (ATCC PTA-4823) strains of this invention
produce high levels of β-carotene, forming deep orange colonies
on LB plates. The content of β-carotene in the E. coli pPCB15 PT5-dxs
PT5-idi PT5-ygbBP yjeR::Tn5 and E. coli pDCQ108 PT5-dxs PT5-idi
PT5-ygbBP PT5-ispB strains was also quantified by HPLC analysis. The
E. coli pPCB15 control, E. coli pPCB15 PT5-dxs PT5-idi PT5-ygbBP yjeR::Tn5
and E. coli pDCQ108 PT5-dxs PT5-idi PT5-ygbBP PT5-ispB strains were
grown in 50 mL LB containing 25 µg/mL of chloramphenicol at 37 °C for
24 h with 250 rpm agitation. Twenty mL of the culture was filtered through
a 37 mm diameter cellulose filter (0.2 µm) (Millipore, Bedford, MA) that had
been pre-weighed after drying in a 95 °C oven for 24 h. After washing with
10 mL of sterile water, the cells on the pre-weighed filter were dried
completely in a 95 °C oven for 24 h until the weight did not change. The dry
cell weight was determined by subtracting the weight of the filter itself
from the total weight.
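
The gravimetric step above amounts to a tare subtraction. A minimal Python sketch follows, assuming weights are recorded in milligrams; the function names and the example weights are hypothetical and are not taken from this Example.

```python
def dry_cell_weight_mg(filter_mg: float, filter_plus_cells_mg: float) -> float:
    """Dry cell weight = dried (filter + cells) weight minus the dried, pre-weighed filter."""
    dcw = filter_plus_cells_mg - filter_mg
    if dcw < 0:
        raise ValueError("total weight is below the filter tare weight")
    return dcw


def dcw_g_per_litre(dcw_mg: float, filtered_volume_ml: float = 20.0) -> float:
    """Scale the dry cell weight of the filtered sample (20 mL here) to g per litre of culture."""
    return (dcw_mg / 1000.0) / (filtered_volume_ml / 1000.0)


# Hypothetical weights, for illustration only.
dcw = dry_cell_weight_mg(filter_mg=85.0, filter_plus_cells_mg=105.0)
print(dcw, "mg dry cells in the 20 mL sample")
print(dcw_g_per_litre(dcw), "g dry cells per litre of culture")
```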

Twenty mL of the culture was harvested by centrifugation at
4000 rpm for 10 min for carotenoid extraction and analysis. The
β-carotene pigment was extracted as described in Example 20. The
carotene extract obtained was analyzed for β-carotene content by
high performance liquid chromatography (HPLC). A 125 x 4 mm RP8
(5 µm particles) column (Hewlett-Packard, San Fernando, CA) was used
for HPLC analysis of β-carotene. The flow rate was 1 mL/min and the
solvent program was as follows: 0-11.5 min linear gradient from 40%
water/60% methanol to 100% methanol, 11.5-20 min 100% methanol,
20-30 min 40% water/60% methanol. β-Carotene was detected by
absorption at 450 nm, and quantitative analysis was carried
out by comparing the area of the β-carotene peak to that of a known
β-carotene standard (Sigma, Saint Louis, MO).
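
The solvent program and the comparison against a standard can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the step at 20 min is treated as instantaneous, and single-point external-standard calibration is an assumption, since the text states only that the peak area was compared to that of a known standard.

```python
def percent_methanol(t_min: float) -> float:
    """Methanol fraction (%) of the mobile phase at time t (min); the balance is water."""
    if not 0.0 <= t_min <= 30.0:
        raise ValueError("the solvent program covers 0-30 min")
    if t_min <= 11.5:
        # Linear gradient from 60 % to 100 % methanol over the first 11.5 min.
        return 60.0 + 40.0 * t_min / 11.5
    if t_min <= 20.0:
        # Hold at 100 % methanol.
        return 100.0
    # Return to 40 % water / 60 % methanol for the rest of the run.
    return 60.0


def quantify_by_external_standard(sample_area: float,
                                  standard_area: float,
                                  standard_conc: float) -> float:
    """Single-point calibration (assumed): concentration scales linearly with peak area."""
    if standard_area <= 0:
        raise ValueError("standard peak area must be positive")
    return sample_area / standard_area * standard_conc


for t in (0.0, 5.75, 11.5, 15.0, 25.0):
    print(t, percent_methanol(t))
```

A multi-point calibration curve would be the more usual choice when the sample and standard peak areas differ widely.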

The E. coli pPCB15 PT5-dxs PT5-idi PT5-ygbBP yjeR::Tn5 and E. coli
pDCQ108 PT5-dxs PT5-idi PT5-ygbBP PT5-ispB strains produced 3.8 mg
of β-carotene per gram of dry cell weight (3,800 ppm) and 6.0 mg of
β-carotene per gram of dry cell weight (6,000 ppm), respectively, while
the E. coli pPCB15 control strain produced 0.2 mg of β-carotene per gram
of dry cell weight (200 ppm) (Table 10). The HPLC analysis of β-carotene
content also showed that the chromosomally engineered E. coli pPCB15
PT5-dxs PT5-idi PT5-ygbBP yjeR::Tn5 and E. coli pDCQ108 PT5-dxs
PT5-idi PT5-ygbBP PT5-ispB strains produced 19-fold and 30-fold more
β-carotene than the control strain, respectively.
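
The unit conversion and fold calculation behind these figures can be checked with a short Python sketch. The helper names are hypothetical; the three input values are the mg/g figures reported above.

```python
def to_ppm(mg_per_g: float) -> float:
    """1 mg per g of dry cell weight corresponds to 1,000 ppm (w/w)."""
    return mg_per_g * 1000.0


def fold_over_control(strain_mg_per_g: float, control_mg_per_g: float) -> float:
    """Ratio of the strain's β-carotene content to that of the control."""
    return strain_mg_per_g / control_mg_per_g


# β-Carotene content (mg per g dry cell weight) as reported in this Example.
reported = {
    "pPCB15 control": 0.2,
    "pPCB15 PT5-dxs PT5-idi PT5-ygbBP yjeR::Tn5": 3.8,
    "pDCQ108 PT5-dxs PT5-idi PT5-ygbBP PT5-ispB": 6.0,
}

control = reported["pPCB15 control"]
for strain, content in reported.items():
    print(f"{strain}: {to_ppm(content):.0f} ppm, "
          f"{fold_over_control(content, control):.0f}-fold vs control")
```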

It has been speculated that the limit for carotenoid production in a
non-carotenogenic host such as E. coli had been reached at a level of
around 1.5 mg/g cell dry weight (1,500 ppm) due to overload of the
membranes and blocking of membrane functionality (Albrecht et al.,
supra). The present method has solved the stated problem by making
modifications to the E. coli chromosome, allowing β-carotene production of
6 mg per g dry cell weight (6,000 ppm), an increase of 30-fold over initial
levels, in E. coli pDCQ108 PT5-dxs PT5-idi PT5-ygbBP PT5-ispB.

TABLE 10
β-Carotene Production

Strain                                                        β-Carotene (mg/g DCW¹)
E. coli MG1655 pPCB15²                                        0.2
E. coli MG1655 pPCB15² PT5-dxs PT5-idi PT5-ygbBP yjeR::Tn5    3.8
E. coli MG1655 pDCQ108³ PT5-dxs PT5-idi PT5-ygbBP PT5-ispB    6.0

¹ Dry cell weight.

² pPCB15 contains the carotenoid biosynthesis gene cluster (crtEXYIB) from Pantoea
stewartii (ATCC no. 8199).

³ pDCQ108 contains the carotenoid biosynthesis gene cluster (crtEXYIB) from Pantoea
stewartii (ATCC no. 8199).
CLAIMS

What is claimed is: 

1. A carotenoid overproducing bacteria comprising the genes
encoding a functional carotenoid enzymatic biosynthetic pathway wherein
the dxs, idi and ygbBP genes are overexpressed and wherein the yjeR
gene is down regulated.

2. A carotenoid overproducing bacteria comprising the genes 
encoding a functional carotenoid enzymatic biosynthetic pathway wherein 
the dxs, idi, ygbBP and ispB genes are overexpressed. 

3. The carotenoid overproducing bacteria of Claim 1 or 2 wherein
the lytB and dxr genes are optionally overexpressed.

4. The carotenoid overproducing bacteria of Claim 1 or 2 wherein
the carotenoid enzymatic biosynthetic pathway consists of the genes dxs,
dxr, ygbP, ychB, ygbB, lytB, idi, ispA, ispB, crtE, crtB, crtI, and crtY.

5. The carotenoid overproducing bacteria of Claim 4 wherein the 
carotenoid enzymatic biosynthetic pathway optionally additionally 
comprises the crtZ and crtW genes. 

6. The carotenoid overproducing bacteria of any of Claims 1-5
wherein the bacteria is selected from the group consisting of Agrobacterium,
Erythrobacter, Chlorobium, Chromatium, Flavobacterium, Cytophaga,
Rhodobacter, Rhodococcus, Streptomyces, Brevibacterium,
Corynebacteria, Mycobacterium, Deinococcus, Paracoccus, Escherichia,
Bacillus, Myxococcus, Salmonella, Yersinia, Erwinia, Pantoea,
Pseudomonas, Sphingomonas, Methylomonas, Methylobacter,
Methylococcus, Methylosinus, Methylomicrobium, Methylocystis,
Alcaligenes, Synechocystis, Synechococcus, Anabaena, Thiobacillus,
Methanobacterium, Klebsiella, and Myxococcus.

7. The carotenoid overproducing bacteria of Claim 6 wherein the
bacteria is E. coli.

8. The carotenoid overproducing bacteria of Claims 1-3 wherein
the dxs, dxr, ygbP, ychB, ygbB, lytB, idi, ispA, ispB are derived from a
Methylomonas sp.

9. The carotenoid overproducing bacteria of any of Claims 1-3
wherein the dxs, idi, ispB and ygbBP genes are under the control of a
strong promoter.

10. The carotenoid overproducing bacteria of Claim 9 wherein the
strong promoter is selected from the group consisting of lac, ara, tet, trp,
λPL, λPR, T7, tac, PT5, and trc.

11. The carotenoid overproducing bacteria of any of Claims 1-3
wherein the dxs, idi, ispB and ygbBP genes are integrated in multicopy in
the bacterial chromosome.

12. The carotenoid overproducing bacteria of any of Claims 1-3
wherein the dxs, idi, ispB and ygbBP genes are present in multicopy in the
bacteria on one or more plasmids.

13. The carotenoid overproducing bacteria of Claim 7 wherein
the yjeR gene is down regulated by gene disruption.

14. The carotenoid overproducing bacteria of Claim 13 wherein the
disrupted yjeR gene has the nucleotide sequence as set forth in SEQ ID
NO:63.

15. The carotenoid overproducing bacteria of any of
Claims 1-3 wherein the dxs, idi, ispB, ygbBP and lytB genes are
chromosomally integrated into the host cell genome.

16. A carotenoid overproducing bacteria selected from the group
consisting of: a strain having the ATCC identification number PTA-4807
and a strain having the ATCC identification number PTA-4823.

17. A method for the production of a carotenoid comprising:

a) growing the carotenoid overproducing bacteria of any of
Claims 1-5, the bacteria overexpressing at least one
gene selected from the group consisting of dxs, idi, ygbBP,
ispB, lytB, dxr, wherein yjeR is optionally downregulated,
for a time sufficient to produce a carotenoid; and

b) optionally recovering the carotenoid from the carotenoid
overproducing bacteria of step (a).

18. A method according to Claim 17 wherein the carotenoid is
selected from the group consisting of antheraxanthin, adonixanthin,
astaxanthin, canthaxanthin, capsorubrin, β-cryptoxanthin,
didehydrolycopene, didehydrolycopene, β-carotene,
ζ-carotene, δ-carotene, γ-carotene,
keto-γ-carotene, ψ-carotene, ε-carotene, β,ψ-carotene, torulene,
echinenone, gamma-carotene, zeta-carotene, alpha-cryptoxanthin,
diatoxanthin, 7,8-didehydroastaxanthin, fucoxanthin, fucoxanthinol,
isorenieratene, β-isorenieratene, lactucaxanthin, lutein, lycopene,
neoxanthin, neurosporene, hydroxyneurosporene, peridinin, phytoene,
rhodopin, rhodopin glucoside, siphonaxanthin, spheroidene,
spheroidenone, spirilloxanthin, uriolide, uriolide acetate, violaxanthin,
zeaxanthin-β-diglucoside, zeaxanthin, and C30-carotenoids.

19. A method according to Claim 18 wherein the carotenoid is 
produced at a level of at least about 6 mg per gram dry cell weight. 

20. A method according to Claim 18 wherein the bacteria is
selected from the group consisting of Agrobacterium, Erythrobacter,
Chlorobium, Chromatium, Flavobacterium, Cytophaga, Rhodobacter,
Rhodococcus, Streptomyces, Brevibacterium, Corynebacteria,
Mycobacterium, Deinococcus, Paracoccus, Escherichia, Bacillus,
Myxococcus, Salmonella, Yersinia, Erwinia, Pantoea, Pseudomonas,
Sphingomonas, Methylomonas, Methylobacter, Methylococcus,
Methylosinus, Methylomicrobium, Methylocystis, Alcaligenes,
Synechocystis, Synechococcus, Anabaena, Thiobacillus,
Methanobacterium, Klebsiella, and Myxococcus.

21. A method according to Claim 20 wherein the bacteria is E. coli.

22. A method according to Claim 17 wherein the dxs, idi, ygbBP,
ispB and lytB genes are under the control of a promoter selected from the
group consisting of lac, ara, tet, trp, λPL, λPR, T7, tac, PT5, and trc.

23. A method according to Claim 17 wherein the dxs, idi, ispB,
ygbBP and lytB genes are integrated in multicopy in the bacterial
chromosome.

24. A method according to Claim 17 wherein the dxs, idi, ispB, 
ygbBP and lytB genes are in multicopy in the bacteria on one or more 
plasmids. 

25. A method according to Claim 17 wherein the yjeR gene is down 
regulated by gene disruption. 

26. A method according to Claim 25 wherein the disrupted yjeR 
gene has the nucleotide sequence as set forth in SEQ ID NO:63. 

27. A method according to Claim 17 wherein the dxs, idi, ispB,
ygbBP and lytB genes are chromosomally integrated into the host cell
genome.
SEQUENCE LISTING

<110> E. I. duPont de Nemours and Company, Inc.

<120> Increasing Carotenoid Production in Bacteria Via Chromosomal
Integration

<130> CL2027 PCT

<150> US 60/434618

<151> 2002-12-19

<160> 66

<170> PatentIn version 3.2

<210> 1

<211> 912

<212> DNA

<213> Pantoea stewartii

<220>

<221> misc_feature

<222> (1)..(3)

<223> ttg alternative start codon used to encode methionine



<400> 1 
ttgacggtct gcgcaaaaaa acacgttcac cttactggca tttcggctga gcagttgctg  60
gctgatatcg atagccgcct tgatcagtta ctgccggttc agggtgagcg ggattgtgtg  120
ggtgccgcga tgcgtgaagg cacgctggca ccgggcaaac gtattcgtcc gatgctgctg  180
ttattaacag cgcgcgatct tggctgtgcg atcagtcacg ggggattact ggatttagcc  240
tgcgcggttg aaatggtgca tgctgcctcg ctgattctgg atgatatgcc ctgcatggac  300
gatgcgcaga tgcgtcgggg gcgtcccacc attcacacgc agtacggtga acatgtggcg  360
attctggcgg cggtcgcttt actcagcaaa gcgtttgggg tgattgccga ggctgaaggt  420
ctgacgccga tagccaaaac tcgcgcggtg tcggagctgt ccactgcgat tggcatgcag  480
ggtctggttc agggccagtt taaggacctc tcggaaggcg ataaaccccg cagcgccgat  540
gccatactgc taaccaatca gtttaaaacc agcacgctgt tttgcgcgtc aacgcaaatg  600
gcgtccattg cggccaacgc gtcctgcgaa gcgcgtgaga acctgcatcg tttctcgctc  660
gatctcggcc aggcctttca gttgcttgac gatcttaccg atggcatgac cgataccggc  720
aaagacatca atcaggatgc aggtaaatca acgctggtca atttattagg ctcaggcgcg  780
gtcgaagaac gcctgcgaca gcatttgcgc ctggccagtg aacacctttc cgcggcatgc  840
caaaacggcc attccaccac ccaacttttt attcaggcct ggtttgacaa aaaactcgct  900
gccgtcagtt aa

<210> 2
<211> 303
<212> PRT
<213> Pantoea stewartii

<400> 2

Met Thr Val Cys Ala Lys Lys His Val His Leu Thr Gly Ile Ser Ala
1 5 10 15

Glu Gln Leu Leu Ala Asp Ile Asp Ser Arg Leu Asp Gln Leu Leu Pro
20 25 30

Val Gln Gly Glu Arg Asp Cys Val Gly Ala Ala Met Arg Glu Gly Thr
35 40 45

Leu Ala Pro Gly Lys Arg Ile Arg Pro Met Leu Leu Leu Leu Thr Ala
50 55 60

Arg Asp Leu Gly Cys Ala Ile Ser His Gly Gly Leu Leu Asp Leu Ala
65 70 75 80

Cys Ala Val Glu Met Val His Ala Ala Ser Leu Ile Leu Asp Asp Met
85 90 95

Pro Cys Met Asp Asp Ala Gln Met Arg Arg Gly Arg Pro Thr Ile His
100 105 110

Thr Gln Tyr Gly Glu His Val Ala Ile Leu Ala Ala Val Ala Leu Leu
115 120 125

Ser Lys Ala Phe Gly Val Ile Ala Glu Ala Glu Gly Leu Thr Pro Ile
130 135 140

Ala Lys Thr Arg Ala Val Ser Glu Leu Ser Thr Ala Ile Gly Met Gln
145 150 155 160

Gly Leu Val Gln Gly Gln Phe Lys Asp Leu Ser Glu Gly Asp Lys Pro
165 170 175

Arg Ser Ala Asp Ala Ile Leu Leu Thr Asn Gln Phe Lys Thr Ser Thr
180 185 190

Leu Phe Cys Ala Ser Thr Gln Met Ala Ser Ile Ala Ala Asn Ala Ser
195 200 205

Cys Glu Ala Arg Glu Asn Leu His Arg Phe Ser Leu Asp Leu Gly Gln
210 215 220

Ala Phe Gln Leu Leu Asp Asp Leu Thr Asp Gly Met Thr Asp Thr Gly
225 230 235 240

Lys Asp Ile Asn Gln Asp Ala Gly Lys Ser Thr Leu Val Asn Leu Leu
245 250 255

Gly Ser Gly Ala Val Glu Glu Arg Leu Arg Gln His Leu Arg Leu Ala
260 265 270

Ser Glu His Leu Ser Ala Ala Cys Gln Asn Gly His Ser Thr Thr Gln
275 280 285

Leu Phe Ile Gln Ala Trp Phe Asp Lys Lys Leu Ala Ala Val Ser
290 295 300



<210> 3
<211> 1296
<212> DNA
<213> Pantoea stewartii

<220>
<221> CDS
<222> (1)..(1296)

<400> 3

atg agc cat ttt gcg gtg atc gca ccg ccc ttt ttc agc cat gtt cgc  48
Met Ser His Phe Ala Val Ile Ala Pro Pro Phe Phe Ser His Val Arg
1 5 10 15

gct ctg caa aac ctt gct cag gaa tta gtg gcc cgc ggt cat cgt gtt  96
Ala Leu Gln Asn Leu Ala Gln Glu Leu Val Ala Arg Gly His Arg Val
20 25 30

acg ttt ttt cag caa cat gac tgc aaa gcg ctg gta acg ggc agc gat  144
Thr Phe Phe Gln Gln His Asp Cys Lys Ala Leu Val Thr Gly Ser Asp
35 40 45

atc gga ttc cag acc gtc gga ctg caa acg cat cct ccc ggt tcc tta  192
Ile Gly Phe Gln Thr Val Gly Leu Gln Thr His Pro Pro Gly Ser Leu
50 55 60

tcg cac ctg ctg cac ctg gcc gcg cac cca ctc gga ccc tcg atg tta  240
Ser His Leu Leu His Leu Ala Ala His Pro Leu Gly Pro Ser Met Leu
65 70 75 80

cga ctg atc aat gaa atg gca cgt acc agc gat atg ctt tgc cgg gaa  288
Arg Leu Ile Asn Glu Met Ala Arg Thr Ser Asp Met Leu Cys Arg Glu
85 90 95

ctg ccc gcc gct ttt cat gcg ttg cag ata gag ggc gtg atc gtt gat  336
Leu Pro Ala Ala Phe His Ala Leu Gln Ile Glu Gly Val Ile Val Asp
100 105 110

caa atg gag ccg gca ggt gca gta gtc gca gaa gcg tca ggt ctg ccg  384
Gln Met Glu Pro Ala Gly Ala Val Val Ala Glu Ala Ser Gly Leu Pro
115 120 125

ttt gtt tcg gtg gcc tgc gcg ctg ccg ctc aac cgc gaa ccg ggt ttg  432
Phe Val Ser Val Ala Cys Ala Leu Pro Leu Asn Arg Glu Pro Gly Leu
130 135 140

cct ctg gcg gtg atg cct ttc gag tac ggc acc agc gat gcg gct cgg  480
Pro Leu Ala Val Met Pro Phe Glu Tyr Gly Thr Ser Asp Ala Ala Arg
145 150 155 160

gaa cgc tat acc acc agc gaa aaa att tat gac tgg ctg atg cga cgt  528
Glu Arg Tyr Thr Thr Ser Glu Lys Ile Tyr Asp Trp Leu Met Arg Arg
165 170 175

cac gat cgt gtg atc gcg cat cat gca tgc aga atg ggt tta gcc ccg  576
His Asp Arg Val Ile Ala His His Ala Cys Arg Met Gly Leu Ala Pro
180 185 190

cgt gaa aaa ctg cat cat tgt ttt tct cca ctg gca caa atc agc cag  624
Arg Glu Lys Leu His His Cys Phe Ser Pro Leu Ala Gln Ile Ser Gln
195 200 205

ttg atc ccc gaa ctg gat ttt ccc cgc aaa gcg ctg cca gac tgc ttt  672
Leu Ile Pro Glu Leu Asp Phe Pro Arg Lys Ala Leu Pro Asp Cys Phe
210 215 220

cat gcg gtt gga ccg tta cgg caa ccc cag ggg acg ccg ggg tca tca  720
His Ala Val Gly Pro Leu Arg Gln Pro Gln Gly Thr Pro Gly Ser Ser
225 230 235 240

lu Z 5 ct l at t J t ccg tcc ccg gac aaa ccc cgt att ttt: Sec teg ctg 768
Thr Ser Tyr Phe Pro Ser Pro Asp Lys Pro Arg Ile Phe Ala Ser Leu
245 250 255

gac acc ctg cag gga cat cgt tat ggc ctg ttc agg acc atc gcc aaa  816
Gly Thr Leu Gln Gly His Arg Tyr Gly Leu Phe Arg Thr Ile Ala Lys
260 265 270

gcc tgc gaa gag gtg gat gcg cag tta ctg ttg gca cac tgt ggc ggc  864
Ala Cys Glu Glu Val Asp Ala Gln Leu Leu Leu Ala His Cys Gly Gly
275 280 285

ctc tca gcc acg cag gca ggt gaa ctg gcc cgg ggc ggg gac att cag  912
Leu Ser Ala Thr Gln Ala Gly Glu Leu Ala Arg Gly Gly Asp Ile Gln
290 295 300

gtt gtg gat ttt gcc gat caa tcc gca gca ctt tca cag gca cag ttg  960
Val Val Asp Phe Ala Asp Gln Ser Ala Ala Leu Ser Gln Ala Gln Leu
305 310 315 320

?u a g ? t g 9 g atg aat acg gta ct 9 g ac g^ att get tcc 1008
Thr Ile Thr His Gly Gly Met Asn Thr Val Leu Asp Ala Ile Ala Ser
325 330 335

cgc aca ccg cta ctg gcg ctg ccg ctg gca ttt gat caa cct ggc gtg  1056
Arg Thr Pro Leu Leu Ala Leu Pro Leu Ala Phe Asp Gln Pro Gly Val
340 345 350

gca tca cga att gtt tat cat ggc atc ggc aag cgt gcg tct cgg ttt  1104
Ala Ser Arg Ile Val Tyr His Gly Ile Gly Lys Arg Ala Ser Arg Phe
355 360 365

t& ?K S at ?? 9 ? tq ?? g cgg c ? g a J* cga tcg ctg ctg act aac 1152
Thr Thr Ser His Ala Leu Ala Arg Gln Ile Arg Ser Leu Leu Thr Asn
370 375 380

acc gat tac ccg cag cgt atg aca aaa att cag gcc gca ttg cgt ctg  1200
Thr Asp Tyr Pro Gln Arg Met Thr Lys Ile Gln Ala Ala Leu Arg Leu
385 390 395 400

gca ggc ggc aca cca gcc gcc gcc gat att gtt gaa cag gcg atg cgg  1248
Ala Gly Gly Thr Pro Ala Ala Ala Asp Ile Val Glu Gln Ala Met Arg
405 410 415

5£ c H gt 5? g £ ca gt ? ctc agt g 9 g ca g gat tat g ca a « 9 C * eta tga 1296
Thr Cys Gln Pro Val Leu Ser Gly Gln Asp Tyr Ala Thr Ala Leu
420 425 430

<210> 4
<211> 431
<212> PRT
<213> Pantoea stewartii

<400> 4

Met Ser His Phe Ala Val Ile Ala Pro Pro Phe Phe Ser His Val Arg
1 5 10 15

Ala Leu Gln Asn Leu Ala Gln Glu Leu Val Ala Arg Gly His Arg Val
20 25 30

Thr Phe Phe Gln Gln His Asp Cys Lys Ala Leu Val Thr Gly Ser Asp
35 40 45

Ile Gly Phe Gln Thr Val Gly Leu Gln Thr His Pro Pro Gly Ser Leu
50 55 60

Ser His Leu Leu His Leu Ala Ala His Pro Leu Gly Pro Ser Met Leu
65 70 75 80

Arg Leu Ile Asn Glu Met Ala Arg Thr Ser Asp Met Leu Cys Arg Glu
85 90 95

Leu Pro Ala Ala Phe His Ala Leu Gln Ile Glu Gly Val Ile Val Asp
100 105 110

Gln Met Glu Pro Ala Gly Ala Val Val Ala Glu Ala Ser Gly Leu Pro
115 120 125

Phe Val Ser Val Ala Cys Ala Leu Pro Leu Asn Arg Glu Pro Gly Leu
130 135 140

Pro Leu Ala Val Met Pro Phe Glu Tyr Gly Thr Ser Asp Ala Ala Arg
145 150 155 160

Glu Arg Tyr Thr Thr Ser Glu Lys Ile Tyr Asp Trp Leu Met Arg Arg
165 170 175

His Asp Arg Val Ile Ala His His Ala Cys Arg Met Gly Leu Ala Pro
180 185 190

Arg Glu Lys Leu His His Cys Phe Ser Pro Leu Ala Gln Ile Ser Gln
195 200 205

Leu Ile Pro Glu Leu Asp Phe Pro Arg Lys Ala Leu Pro Asp Cys Phe
210 215 220

His Ala Val Gly Pro Leu Arg Gln Pro Gln Gly Thr Pro Gly Ser Ser
225 230 235 240

Thr Ser Tyr Phe Pro Ser Pro Asp Lys Pro Arg Ile Phe Ala Ser Leu
245 250 255

Gly Thr Leu Gln Gly His Arg Tyr Gly Leu Phe Arg Thr Ile Ala Lys
260 265 270

Ala Cys Glu Glu Val Asp Ala Gln Leu Leu Leu Ala His Cys Gly Gly
275 280 285

Leu Ser Ala Thr Gln Ala Gly Glu Leu Ala Arg Gly Gly Asp Ile Gln
290 295 300

Val Val Asp Phe Ala Asp Gln Ser Ala Ala Leu Ser Gln Ala Gln Leu
305 310 315 320

Thr Ile Thr His Gly Gly Met Asn Thr Val Leu Asp Ala Ile Ala Ser
325 330 335

Arg Thr Pro Leu Leu Ala Leu Pro Leu Ala Phe Asp Gln Pro Gly Val
340 345 350

Ala Ser Arg Ile Val Tyr His Gly Ile Gly Lys Arg Ala Ser Arg Phe
355 360 365

Thr Thr Ser His Ala Leu Ala Arg Gln Ile Arg Ser Leu Leu Thr Asn
370 375 380

Thr Asp Tyr Pro Gln Arg Met Thr Lys Ile Gln Ala Ala Leu Arg Leu
385 390 395 400

Ala Gly Gly Thr Pro Ala Ala Ala Asp Ile Val Glu Gln Ala Met Arg
405 410 415

Thr Cys Gln Pro Val Leu Ser Gly Gln Asp Tyr Ala Thr Ala Leu
420 425 430

<210> 5
<211> 1149
<212> DNA
<213> Pantoea stewartii

<220>
<221> CDS
<222> (1)..(1149)

<400> 5

atg caa ccg cac tat gat ctc att ctg gtc ggt gcc ggt ctg gct aat  48
Met Gln Pro His Tyr Asp Leu Ile Leu Val Gly Ala Gly Leu Ala Asn
1 5 10 15

ggc ctt atc gcg ctc cgg ctt cag caa cag cat ccg gat atg cgg atc  96
Gly Leu Ile Ala Leu Arg Leu Gln Gln Gln His Pro Asp Met Arg Ile
20 25 30

ttg ctt att gag gcg ggt cct gag gcg gga ggg aac cat acc tgg tcc  144
Leu Leu Ile Glu Ala Gly Pro Glu Ala Gly Gly Asn His Thr Trp Ser
35 40 45

ttt cac gaa gag gat tta acg ctg aat cag cat cgc tgg ata gcg ccg  192
Phe His Glu Glu Asp Leu Thr Leu Asn Gln His Arg Trp Ile Ala Pro
50 55 60

ctt gtg gtc cat cac tgg ccc gac tac cag gtt cgt ttc ccc caa cgc  240
Leu Val Val His His Trp Pro Asp Tyr Gln Val Arg Phe Pro Gln Arg
65 70 75 80

cgt cgc cat gtg aac agt ggc tac tac tgc gtg acc tcc cgg cat ttc  288
Arg Arg His Val Asn Ser Gly Tyr Tyr Cys Val Thr Ser Arg His Phe
85 90 95

WO 2004/056975 PCT/US2003/041812 



gcc ggg ata ctc cgg caa cag ttt gga caa cat tta tgg ctg cat acc  336
Ala Gly Ile Leu Arg Gln Gln Phe Gly Gln His Leu Trp Leu His Thr
100 105 110

gcg gtt tca gcc gtt cat gct gaa tcg gtc cag tta gcg gat ggc cgg  384
Ala Val Ser Ala Val His Ala Glu Ser Val Gln Leu Ala Asp Gly Arg
115 120 125

att att cat gcc agt aca gtg atc gac gga cgg ggt tac acg cct gat  432
Ile Ile His Ala Ser Thr Val Ile Asp Gly Arg Gly Tyr Thr Pro Asp
130 135 140

tct gca cta cgc gta gga ttc cag gca ttt atc ggt cag gag tgg caa  480
Ser Ala Leu Arg Val Gly Phe Gln Ala Phe Ile Gly Gln Glu Trp Gln
145 150 155 160

ctg agc gcg ccg cat ggt tta tcg tca ccg att atc atg gat gcg acg  528
Leu Ser Ala Pro His Gly Leu Ser Ser Pro Ile Ile Met Asp Ala Thr
165 170 175

gtc gat cag caa aat ggc tac cgc ttt gtt tat acc ctg ccg ctt tcc  576
Val Asp Gln Gln Asn Gly Tyr Arg Phe Val Tyr Thr Leu Pro Leu Ser
180 185 190

gca acc gca ctg ctg atc gaa gac aca cac tac att gac aag gct aat  624
Ala Thr Ala Leu Leu Ile Glu Asp Thr His Tyr Ile Asp Lys Ala Asn
195 200 205

ctt cag gcc gaa cgg gcg cgt cag aac att cgc gat tat gct gcg cga  672
Leu Gln Ala Glu Arg Ala Arg Gln Asn Ile Arg Asp Tyr Ala Ala Arg
210 215 220

cag ggt tgg ccg tta cag acg ttg ctg cgg gaa gaa cag ggt gca ttg  720
Gln Gly Trp Pro Leu Gln Thr Leu Leu Arg Glu Glu Gln Gly Ala Leu
225 230 235 240

ccc att acg tta acg ggc gat aat cgt cag ttt tgg caa cag caa ccg  768
Pro Ile Thr Leu Thr Gly Asp Asn Arg Gln Phe Trp Gln Gln Gln Pro
245 250 255

caa gcc tgt agc gga tta cgc gcc ggg ctg ttt cat ccg aca acc ggc  816
Gln Ala Cys Ser Gly Leu Arg Ala Gly Leu Phe His Pro Thr Thr Gly
260 265 270

tac tcc cta ccg ctc gcg gtg gcg ctg gcc gat cgt ctc agc gcg ctg  864
Tyr Ser Leu Pro Leu Ala Val Ala Leu Ala Asp Arg Leu Ser Ala Leu
275 280 285

gat gtg ttt acc tct tcc tct gtt cac cag acg att gct cac ttt gcc  912
Asp Val Phe Thr Ser Ser Ser Val His Gln Thr Ile Ala His Phe Ala
290 295 300

cag caa cgt tgg cag caa cag ggg ttt ttc cgc atg ctg aat cgc atg  960
Gln Gln Arg Trp Gln Gln Gln Gly Phe Phe Arg Met Leu Asn Arg Met
305 310 315 320

ttg ttt tta gcc gga ccg gcc gag tca cgc tgg cgt gtg atg cag cgt  1008
Leu Phe Leu Ala Gly Pro Ala Glu Ser Arg Trp Arg Val Met Gln Arg
325 330 335

ttc tat ggc tta ccc gag gat ttg att gcc cgc ttt tat gcg gga aaa  1056
Phe Tyr Gly Leu Pro Glu Asp Leu Ile Ala Arg Phe Tyr Ala Gly Lys
340 345 350

ctc acc gtg acc gat cgg cta cgc att ctg agc ggc aag ccg ccc gtt  1104
Leu Thr Val Thr Asp Arg Leu Arg Ile Leu Ser Gly Lys Pro Pro Val
355 360 365

ccc gtt ttc gcg gca ttg cag gca att atg acg act cat cgt tga  1149
Pro Val Phe Ala Ala Leu Gln Ala Ile Met Thr Thr His Arg
370 375 380

<210> 6
<211> 382
<212> PRT
<213> Pantoea stewartii

<400> 6

Met Gln Pro His Tyr Asp Leu Ile Leu Val Gly Ala Gly Leu Ala Asn
1 5 10 15

Gly Leu Ile Ala Leu Arg Leu Gln Gln Gln His Pro Asp Met Arg Ile
20 25 30

Leu Leu Ile Glu Ala Gly Pro Glu Ala Gly Gly Asn His Thr Trp Ser
35 40 45

Phe His Glu Glu Asp Leu Thr Leu Asn Gln His Arg Trp Ile Ala Pro
50 55 60

Leu Val Val His His Trp Pro Asp Tyr Gln Val Arg Phe Pro Gln Arg
65 70 75 80

Arg Arg His Val Asn Ser Gly Tyr Tyr Cys Val Thr Ser Arg His Phe
85 90 95

Ala Gly Ile Leu Arg Gln Gln Phe Gly Gln His Leu Trp Leu His Thr
100 105 110

Ala Val Ser Ala Val His Ala Glu Ser Val Gln Leu Ala Asp Gly Arg
115 120 125

Ile Ile His Ala Ser Thr Val Ile Asp Gly Arg Gly Tyr Thr Pro Asp
130 135 140

Ser Ala Leu Arg Val Gly Phe Gln Ala Phe Ile Gly Gln Glu Trp Gln
145 150 155 160

Leu Ser Ala Pro His Gly Leu Ser Ser Pro Ile Ile Met Asp Ala Thr
165 170 175

Val Asp Gln Gln Asn Gly Tyr Arg Phe Val Tyr Thr Leu Pro Leu Ser
180 185 190

Ala Thr Ala Leu Leu Ile Glu Asp Thr His Tyr Ile Asp Lys Ala Asn
195 200 205

Leu Gln Ala Glu Arg Ala Arg Gln Asn Ile Arg Asp Tyr Ala Ala Arg
210 215 220

Gln Gly Trp Pro Leu Gln Thr Leu Leu Arg Glu Glu Gln Gly Ala Leu
225 230 235 240

Pro Ile Thr Leu Thr Gly Asp Asn Arg Gln Phe Trp Gln Gln Gln Pro
245 250 255

Gln Ala Cys Ser Gly Leu Arg Ala Gly Leu Phe His Pro Thr Thr Gly
260 265 270

Tyr Ser Leu Pro Leu Ala Val Ala Leu Ala Asp Arg Leu Ser Ala Leu
275 280 285

Asp Val Phe Thr Ser Ser Ser Val His Gln Thr Ile Ala His Phe Ala
290 295 300

Gln Gln Arg Trp Gln Gln Gln Gly Phe Phe Arg Met Leu Asn Arg Met
305 310 315 320

Leu Phe Leu Ala Gly Pro Ala Glu Ser Arg Trp Arg Val Met Gln Arg
325 330 335

Phe Tyr Gly Leu Pro Glu Asp Leu Ile Ala Arg Phe Tyr Ala Gly Lys
340 345 350

Leu Thr Val Thr Asp Arg Leu Arg Ile Leu Ser Gly Lys Pro Pro Val
355 360 365

Pro Val Phe Ala Ala Leu Gln Ala Ile Met Thr Thr His Arg
370 375 380

<210> 7
<211> 1479
<212> DNA
<213> Pantoea stewartii

<220>
<221> CDS
<222> (1)..(1479)

<400> 7

atg aaa cca act acg gta att ggt gcg ggc ttt ggt ggc ctg gca ctg  48
Met Lys Pro Thr Thr Val Ile Gly Ala Gly Phe Gly Gly Leu Ala Leu
1 5 10 15

gca att cgt tta cag gcc gca ggt att cct gtt ttg ctg ctt gag cag  96
Ala Ile Arg Leu Gln Ala Ala Gly Ile Pro Val Leu Leu Leu Glu Gln
20 25 30

cgc gac aag ccg ggt ggc cgg gct tat gtt tat cag gag cag ggc ttt  144
Arg Asp Lys Pro Gly Gly Arg Ala Tyr Val Tyr Gln Glu Gln Gly Phe
35 40 45

act ttt gat gca ggc cct acc gtt atc acc gat ccc agc gcg att gaa  192
Thr Phe Asp Ala Gly Pro Thr Val Ile Thr Asp Pro Ser Ala Ile Glu
50 55 60

gaa ctg ttt gct ctg gcc ggt aaa cag ctt aag gat tac gtc gag ctg  240
Glu Leu Phe Ala Leu Ala Gly Lys Gln Leu Lys Asp Tyr Val Glu Leu
65 70 75 80

ttg ccg gtc acg ccg ttt tat cgc ctg tgc tgg gag tcc ggc aag gtc  288
Leu Pro Val Thr Pro Phe Tyr Arg Leu Cys Trp Glu Ser Gly Lys Val
85 90 95

ttc aat tac gat aac gac cag gcc cag tta gaa gcg cag ata cag cag  336
Phe Asn Tyr Asp Asn Asp Gln Ala Gln Leu Glu Ala Gln Ile Gln Gln
100 105 110

ttt aat ccg cgc gat gtt gcg ggt tat cga gcg ttc ctt gac tat tcg  384
Phe Asn Pro Arg Asp Val Ala Gly Tyr Arg Ala Phe Leu Asp Tyr Ser
115 120 125

cgt gcc gta ttc aat gag ggc tat ctg aag ctc ggc act gtg cct ttt  432
Arg Ala Val Phe Asn Glu Gly Tyr Leu Lys Leu Gly Thr Val Pro Phe
130 135 140

tta tcg ttc aaa gac atg ctt cgg gcc gcg ccc cag ttg gca aag ctg  480
Leu Ser Phe Lys Asp Met Leu Arg Ala Ala Pro Gln Leu Ala Lys Leu
145 150 155 160

cag gca tgg cgc agc gtt tac agt aaa gtt gcc ggc tac att gag gat  528
Gln Ala Trp Arg Ser Val Tyr Ser Lys Val Ala Gly Tyr Ile Glu Asp
165 170 175

gag cat ctt cgg cag gcg ttt tct ttt cac tcg ctc tta gtg ggg ggg  576
Glu His Leu Arg Gln Ala Phe Ser Phe His Ser Leu Leu Val Gly Gly
180 185 190

aat ccg ttt gca acc tcg tcc att tat acg ctg att cac gcg tta gaa  624
Asn Pro Phe Ala Thr Ser Ser Ile Tyr Thr Leu Ile His Ala Leu Glu
195 200 205

cgg gaa tgg ggc gtc tgg ttt cca cgc ggt gga acc ggt gcg ctg gtc  672
Arg Glu Trp Gly Val Trp Phe Pro Arg Gly Gly Thr Gly Ala Leu Val
210 215 220

aat ggc atg atc aag ctg ttt cag gat ctg ggc ggc gaa gtc gtg ctt  720
Asn Gly Met Ile Lys Leu Phe Gln Asp Leu Gly Gly Glu Val Val Leu
225 230 235 240

aac gcc cgg gtc agt cat atg gaa acc gtt ggg gac aag att cag gcc  768
Asn Ala Arg Val Ser His Met Glu Thr Val Gly Asp Lys Ile Gln Ala
245 250 255

gtg cag ttg gaa gac ggc aga cgg ttt gaa acc tgc gcg gtg gcg tcg  816
Val Gln Leu Glu Asp Gly Arg Arg Phe Glu Thr Cys Ala Val Ala Ser
260 265 270

aac gct gat gtt gta cat acc tat cgc gat ctg ctg tct cag cat ccc  864
Asn Ala Asp Val Val His Thr Tyr Arg Asp Leu Leu Ser Gln His Pro
275 280 285

gca gcc gct aag cag gcg aaa aaa ctg caa tcc aag cgt atg agt aac  912
Ala Ala Ala Lys Gln Ala Lys Lys Leu Gln Ser Lys Arg Met Ser Asn
290 295 300

tca ctg ttt gta ctc tat ttt ggt ctc aac cat cat cac gat caa ctc  960
Ser Leu Phe Val Leu Tyr Phe Gly Leu Asn His His His Asp Gln Leu
305 310 315 320

gcc cat cat acc gtc tgt ttt ggg cca cgc tac cgt gaa ctg att cac  1008
Ala His His Thr Val Cys Phe Gly Pro Arg Tyr Arg Glu Leu Ile His
325 330 335

gaa att ttt aac cat gat ggt ctg gct gag gat ttt tcg ctt tat tta  1056
Glu Ile Phe Asn His Asp Gly Leu Ala Glu Asp Phe Ser Leu Tyr Leu
340 345 350

cac gca cct tgt gtc acg gat ccg tca ctg gca ccg gaa ggg tgc ggc  1104
His Ala Pro Cys Val Thr Asp Pro Ser Leu Ala Pro Glu Gly Cys Gly
355 360 365

agc tat tat gtg ctg gcg cct gtt cca cac tta ggc acg gcg aac ctc  1152
Ser Tyr Tyr Val Leu Ala Pro Val Pro His Leu Gly Thr Ala Asn Leu
370 375 380

gac tgg gcg gta gaa gga ccc cga ctg cgc gat cgt att ttt gac tac  1200
Asp Trp Ala Val Glu Gly Pro Arg Leu Arg Asp Arg Ile Phe Asp Tyr
385 390 395 400

ctt gag caa cat tac atg cct ggc ttg cga agc cag ttg gtg acg cac  1248
Leu Glu Gln His Tyr Met Pro Gly Leu Arg Ser Gln Leu Val Thr His
405 410 415

cgt atg ttt acg ccg ttc gat ttc cgc gac gag ctc aat gcc tgg caa  1296
Arg Met Phe Thr Pro Phe Asp Phe Arg Asp Glu Leu Asn Ala Trp Gln
420 425 430

ggt tcg gcc ttc tcg gtt gaa cct att ctg acc cag agc gcc tgg ttc  1344
Gly Ser Ala Phe Ser Val Glu Pro Ile Leu Thr Gln Ser Ala Trp Phe
435 440 445

cga cca cat aac cgc gat aag cac att gat aat ctt tat ctg gtt ggc  1392
Arg Pro His Asn Arg Asp Lys His Ile Asp Asn Leu Tyr Leu Val Gly
450 455 460

gca ggc acc cat cct ggc gcg ggc att ccc ggc gta atc ggc tcg gcg  1440
Ala Gly Thr His Pro Gly Ala Gly Ile Pro Gly Val Ile Gly Ser Ala
465 470 475 480

aag gcg acg gca ggc tta atg ctg gag gac ctg att tga  1479
Lys Ala Thr Ala Gly Leu Met Leu Glu Asp Leu Ile
485 490

<210> 8
<211> 492
<212> PRT
<213> Pantoea stewartii
<400> 8

Met Lys Pro Thr Thr Val Ile Gly Ala Gly Phe Gly Gly Leu Ala Leu
1 5 10 15

Ala Ile Arg Leu Gln Ala Ala Gly Ile Pro Val Leu Leu Leu Glu Gln
20 25 30

Arg Asp Lys Pro Gly Gly Arg Ala Tyr Val Tyr Gln Glu Gln Gly Phe
35 40 45

Thr Phe Asp Ala Gly Pro Thr Val Ile Thr Asp Pro Ser Ala Ile Glu
50 55 60

Glu Leu Phe Ala Leu Ala Gly Lys Gln Leu Lys Asp Tyr Val Glu Leu
65 70 75 80

Leu Pro Val Thr Pro Phe Tyr Arg Leu Cys Trp Glu Ser Gly Lys Val
85 90 95

Phe Asn Tyr Asp Asn Asp Gln Ala Gln Leu Glu Ala Gln Ile Gln Gln
100 105 110

Phe Asn Pro Arg Asp Val Ala Gly Tyr Arg Ala Phe Leu Asp Tyr Ser
115 120 125

Arg Ala Val Phe Asn Glu Gly Tyr Leu Lys Leu Gly Thr Val Pro Phe
130 135 140

Leu Ser Phe Lys Asp Met Leu Arg Ala Ala Pro Gln Leu Ala Lys Leu
145 150 155 160

Gln Ala Trp Arg Ser Val Tyr Ser Lys Val Ala Gly Tyr Ile Glu Asp
165 170 175

Glu His Leu Arg Gln Ala Phe Ser Phe His Ser Leu Leu Val
185 190

Asn Pro Phe Ala Thr Ser Ser Ile Tyr Thr Leu Ile His Ala Leu Glu
195 200 205

Arg Glu Trp Gly Val Trp Phe Pro Arg Gly Gly Thr Gly Ala Leu Val
210 215 220

Asn Gly Met Ile Lys Leu Phe Gln Asp Leu Gly Gly Glu Val Val Leu
225 230 235 240

Asn Ala Arg Val Ser His Met Glu Thr Val Gly Asp Lys Ile Gln Ala
245 250 255

Val Gln Leu Glu Asp Gly Arg Arg Phe Glu Thr Cys Ala Val Ala Ser
260 265 270

Asn Ala Asp Val Val His Thr Tyr Arg Asp Leu Leu Ser Gln His Pro
275 280 285

Ala Ala Ala Lys Gln Ala Lys Lys Leu Gln Ser Lys Arg Met Ser Asn
290 295 300

Ser Leu Phe Val Leu Tyr Phe Gly Leu Asn His His His Asp Gln Leu
305 310 315 320

Ala His His Thr Val Cys Phe Gly Pro Arg Tyr Arg Glu Leu Ile His
325 330 335

Glu Ile Phe Asn His Asp Gly Leu Ala Glu Asp Phe Ser Leu Tyr Leu
340 345 350

His Ala Pro Cys Val Thr Asp Pro Ser Leu Ala Pro Glu Gly Cys Gly
355 360 365

Ser Tyr Tyr Val Leu Ala Pro Val Pro His Leu Gly Thr Ala Asn Leu
370 375 380

Asp Trp Ala Val Glu Gly Pro Arg Leu Arg Asp Arg Ile Phe Asp Tyr
385 390 395 400

Leu Glu Gln His Tyr Met Pro Gly Leu Arg Ser Gln Leu Val Thr His
405 410 415

Arg Met Phe Thr Pro Phe Asp Phe Arg Asp Glu Leu Asn Ala Trp Gln
420 425 430

Gly Ser Ala Phe Ser Val Glu Pro Ile Leu Thr Gln Ser Ala Trp Phe
435 440 445

Arg Pro His Asn Arg Asp Lys His Ile Asp Asn Leu Tyr Leu Val Gly
450 455 460

Ala Gly Thr His Pro Gly Ala Gly Ile Pro Gly Val Ile Gly Ser Ala
465 470 475 480

Lys Ala Thr Ala Gly Leu Met Leu Glu Asp Leu Ile
485 490

<210> 9
<211> 891
<212> DNA
<213> Pantoea stewartii
<220>
<221> CDS
<222> (1)..(891)
<400> 9

atg gcg gtt ggc tcg aaa agc ttt gcg act gca tcg acg ctt ttc gac 48
Met Ala Val Gly Ser Lys Ser Phe Ala Thr Ala Ser Thr Leu Phe Asp
1 5 10 15

gcc aaa acc cgt cgc agc gtg ctg atg ctt tac gca tgg tgc cgc cac 96
Ala Lys Thr Arg Arg Ser Val Leu Met Leu Tyr Ala Trp Cys Arg His
20 25 30

tgc gac gac gtc att gac gat caa aca ctg ggc ttt cat gcc gac cag 144
Cys Asp Asp Val Ile Asp Asp Gln Thr Leu Gly Phe His Ala Asp Gln
35 40 45

ccc tct tcg cag atg cct gag cag cgc ctg cag cag ctt gaa atg aaa 192
Pro Ser Ser Gln Met Pro Glu Gln Arg Leu Gln Gln Leu Glu Met Lys
50 55 60

acg cgt cag gcc tac gcc ggt tcg caa atg cac gag ccc gct ttt gcc 240
Thr Arg Gln Ala Tyr Ala Gly Ser Gln Met His Glu Pro Ala Phe Ala
65 70 75 80

gcg ttt cag gag gtc gcg atg gcg cat gat atc gct ccc gcc tac gcg 288
Ala Phe Gln Glu Val Ala Met Ala His Asp Ile Ala Pro Ala Tyr Ala
85 90 95

ttc gac cat ctg gaa ggt ttt gcc atg gat gtg cgc gaa acg cgc tac 336
Phe Asp His Leu Glu Gly Phe Ala Met Asp Val Arg Glu Thr Arg Tyr
100 105 110

ctg aca ctg gac gat acg ctg cgt tat tgc tat cac gtc gcc ggt gtt 384
Leu Thr Leu Asp Asp Thr Leu Arg Tyr Cys Tyr His Val Ala Gly Val
115 120 125
gtg ggc ctg atg atg gcg caa att atg ggc gtt cgc gat aac gcc acg 432
Val Gly Leu Met Met Ala Gln Ile Met Gly Val Arg Asp Asn Ala Thr
130 135 140

ctc gat cgc gcc tgc gat ctc ggg ctg gct ttc cag ttg acc aac att 480
Leu Asp Arg Ala Cys Asp Leu Gly Leu Ala Phe Gln Leu Thr Asn Ile
145 150 155 160

gcg cgt gat att gtc gac gat gct cag gtg ggc cgc tgt tat ctg cct 528
Ala Arg Asp Ile Val Asp Asp Ala Gln Val Gly Arg Cys Tyr Leu Pro
165 170 175

gaa agc tgg ctg gaa gag gaa gga ctg acg aaa gcg aat tat gct gcg 576
Glu Ser Trp Leu Glu Glu Glu Gly Leu Thr Lys Ala Asn Tyr Ala Ala
180 185 190

cca gaa aac cgg cag gcc tta agc cgt atc gcc ggg cga ctg gta cgg 624
Pro Glu Asn Arg Gln Ala Leu Ser Arg Ile Ala Gly Arg Leu Val Arg
195 200 205

gaa gcg gaa ccc tat tac gta tca tca atg gcc ggt ctg gca caa tta 672
Glu Ala Glu Pro Tyr Tyr Val Ser Ser Met Ala Gly Leu Ala Gln Leu
210 215 220

ccc tta cgc tcg gcc tgg gcc atc gcg aca gcg aag cag gtg tac cgt 720
Pro Leu Arg Ser Ala Trp Ala Ile Ala Thr Ala Lys Gln Val Tyr Arg
225 230 235 240

aaa att ggc gtg aaa gtt gaa cag gcc ggt aag cag gcc tgg gat cat 768
Lys Ile Gly Val Lys Val Glu Gln Ala Gly Lys Gln Ala Trp Asp His
245 250 255

cgc cag tcc acg tcc acc gcc gaa aaa tta acg ctt ttg ctg acg gca 816
Arg Gln Ser Thr Ser Thr Ala Glu Lys Leu Thr Leu Leu Leu Thr Ala
260 265 270

tcc ggt cag gca gtt act tcc cgg atg aag acg tat cca ccc cgt cct 864
Ser Gly Gln Ala Val Thr Ser Arg Met Lys Thr Tyr Pro Pro Arg Pro
275 280 285

gct cat ctc tgg cag cgc ccg atc tag 891
Ala His Leu Trp Gln Arg Pro Ile
290 295

<210> 10
<211> 296
<212> PRT
<213> Pantoea stewartii
<400> 10

Met Ala Val Gly Ser Lys Ser Phe Ala Thr Ala Ser Thr Leu Phe Asp
1 5 10 15

Ala Lys Thr Arg Arg Ser Val Leu Met Leu Tyr Ala Trp Cys Arg His
20 25 30

Cys Asp Asp Val Ile Asp Asp Gln Thr Leu Gly Phe His Ala Asp Gln
35 40 45

Pro Ser Ser Gln Met Pro Glu Gln Arg Leu Gln Gln Leu Glu Met Lys
50 55 60

Thr Arg Gln Ala Tyr Ala Gly Ser Gln Met His Glu Pro Ala Phe Ala
65 70 75 80

Ala Phe Gln Glu Val Ala Met Ala His Asp Ile Ala Pro Ala Tyr Ala
85 90 95

Phe Asp His Leu Glu Gly Phe Ala Met Asp Val Arg Glu Thr Arg Tyr
100 105 110

Leu Thr Leu Asp Asp Thr Leu Arg Tyr Cys Tyr His Val Ala Gly Val
115 120 125

Val Gly Leu Met Met Ala Gln Ile Met Gly Val Arg Asp Asn Ala Thr
130 135 140

Leu Asp Arg Ala Cys Asp Leu Gly Leu Ala Phe Gln Leu Thr Asn Ile
145 150 155 160

Ala Arg Asp Ile Val Asp Asp Ala Gln Val Gly Arg Cys Tyr Leu Pro
165 170 175

Glu Ser Trp Leu Glu Glu Glu Gly Leu Thr Lys Ala Asn Tyr Ala Ala
180 185 190

Pro Glu Asn Arg Gln Ala Leu Ser Arg Ile Ala Gly Arg Leu Val Arg
195 200 205

Glu Ala Glu Pro Tyr Tyr Val Ser Ser Met Ala Gly Leu Ala Gln Leu
210 215 220

Pro Leu Arg Ser Ala Trp Ala Ile Ala Thr Ala Lys Gln Val Tyr Arg
225 230 235 240

Lys Ile Gly Val Lys Val Glu Gln Ala Gly Lys Gln Ala Trp Asp His
245 250 255

Arg Gln Ser Thr Ser Thr Ala Glu Lys Leu Thr Leu Leu Leu Thr Ala
260 265 270

Ser Gly Gln Ala Val Thr Ser Arg Met Lys Thr Tyr Pro Pro Arg Pro
275 280 285

Ala His Leu Trp Gln Arg Pro Ile
290 295

<210> 11
<211> 528
<212> DNA
<213> Pantoea stewartii
<220>
<221> CDS
<222> (1)..(528)
<400> 11

atg ttg tgg att tgg aat gcc ctg atc gtg ttt gtc acc gtg gtc ggc 48
Met Leu Trp Ile Trp Asn Ala Leu Ile Val Phe Val Thr Val Val Gly
1 5 10 15

atg gaa gtg gtt gct gca ctg gca cat aaa tac atc atg cac ggc tgg 96
Met Glu Val Val Ala Ala Leu Ala His Lys Tyr Ile Met His Gly Trp
20 25 30

ggt tgg ggc tgg cat ctt tca cat cat gaa ccg cgt aaa ggc gca ttt 144
Gly Trp Gly Trp His Leu Ser His His Glu Pro Arg Lys Gly Ala Phe
35 40 45

gaa gtt aac gat ctc tat gcc gtg gta ttc gcc att gtg tcg att gcc 192
Glu Val Asn Asp Leu Tyr Ala Val Val Phe Ala Ile Val Ser Ile Ala
50 55 60

ctg att tac ttc ggc agt aca gga atc tgg ccg ctc cag tgg att ggt 240
Leu Ile Tyr Phe Gly Ser Thr Gly Ile Trp Pro Leu Gln Trp Ile Gly
65 70 75 80

gca ggc atg acc gct tat ggt tta ctg tat ttt atg gtc cac gac gga 288
Ala Gly Met Thr Ala Tyr Gly Leu Leu Tyr Phe Met Val His Asp Gly
85 90 95

ctg gta cac cag cgc tgg ccg ttc cgc tac ata ccg cgc aaa ggc tac 336
Leu Val His Gln Arg Trp Pro Phe Arg Tyr Ile Pro Arg Lys Gly Tyr
100 105 110

ctg aaa cgg tta tac atg gcc cac cgt atg cat cat gct gta agg gga 384
Leu Lys Arg Leu Tyr Met Ala His Arg Met His His Ala Val Arg Gly
115 120 125

aaa gag ggc tgc gtg tcc ttt ggt ttt ctg tac gcg cca ccg tta tct 432
Lys Glu Gly Cys Val Ser Phe Gly Phe Leu Tyr Ala Pro Pro Leu Ser
130 135 140

aaa ctt cag gcg acg ctg aga gaa agg cat gcg gct aga tcg ggc gct 480
Lys Leu Gln Ala Thr Leu Arg Glu Arg His Ala Ala Arg Ser Gly Ala
145 150 155 160

gcc aga gat gag cag gac gag gtg gat acg tct tca tcc ggg aag taa 528
Ala Arg Asp Glu Gln Asp Gly Val Asp Thr Ser Ser Ser Gly Lys
165 170 175

<210> 12 
<211> 175 
<212> PRT 

<213> Pantoea stewartii 
<400> 12 

Met Leu Trp Ile Trp Asn Ala Leu Ile Val Phe Val Thr Val Val Gly
1 5 10 15

Met Glu Val Val Ala Ala Leu Ala His Lys Tyr Ile Met His Gly Trp
20 25 30

Gly Trp Gly Trp His Leu Ser His His Glu Pro Arg Lys Gly Ala Phe
35 40 45

Glu Val Asn Asp Leu Tyr Ala Val Val Phe Ala Ile Val Ser Ile Ala
50 55 60

Leu Ile Tyr Phe Gly Ser Thr Gly Ile Trp Pro Leu Gln Trp Ile Gly
65 70 75 80

Ala Gly Met Thr Ala Tyr Gly Leu Leu Tyr Phe Met Val His Asp Gly
85 90 95

Leu Val His Gln Arg Trp Pro Phe Arg Tyr Ile Pro Arg Lys Gly Tyr
100 105 110

Leu Lys Arg Leu Tyr Met Ala His Arg Met His His Ala Val Arg Gly
115 120 125

Lys Glu Gly Cys Val Ser Phe Gly Phe Leu Tyr Ala Pro Pro Leu Ser
130 135 140

Lys Leu Gln Ala Thr Leu Arg Glu Arg His Ala Ala Arg Ser Gly Ala
145 150 155 160

Ala Arg Asp Glu Gln Asp Gly Val Asp Thr Ser Ser Ser Gly Lys
165 170 175



<210> 13
<211> 1860
<212> DNA
<213> Methylomonas 16a
<220>
<221> CDS
<222> (1)..(1860)
<400> 13

atg gct ctt tcc aaa gac ttc cct cta ctc aat tcc atc cac acc cca 48
Met Ala Leu Ser Lys Asp Phe Pro Leu Leu Asn Ser Ile His Thr Pro
1 5 10 15

gcg gac ata cgc gcg ctg tcc aag gac cag ctc cag caa ctg gct gac 96
Ala Asp Ile Arg Ala Leu Ser Lys Asp Gln Leu Gln Gln Leu Ala Asp
20 25 30

gag gtg cgc ggc tat ctg acc cac acg gtc agc att tcc ggc ggc cat 144
Glu Val Arg Gly Tyr Leu Thr His Thr Val Ser Ile Ser Gly Gly His
35 40 45

ttt gcg gcc ggc ctc ggc acc gtg gaa ctg acc gtg gcc ttg cat tat 192
Phe Ala Ala Gly Leu Gly Thr Val Glu Leu Thr Val Ala Leu His Tyr
50 55 60

gtg ttc aat acc ccc gtc gat cag ttg gtc tgg gac gtg ggc cat cag 240
Val Phe Asn Thr Pro Val Asp Gln Leu Val Trp Asp Val Gly His Gln
65 70 75 80

gcc tat ccg cac aag att ctg acc ggt cgc aag gag cgc atg ccg acc 288
Ala Tyr Pro His Lys Ile Leu Thr Gly Arg Lys Glu Arg Met Pro Thr
85 90 95

att cgc acc ctg ggc ggg gtg tca gcc ttt ccg gcg cgg gac gag agc 336
Ile Arg Thr Leu Gly Gly Val Ser Ala Phe Pro Ala Arg Asp Glu Ser
100 105 110
gaa tac gat gcc ttc ggc gtc ggc cat tcc agc acc tcg atc agc gcg 384
Glu Tyr Asp Ala Phe Gly Val Gly His Ser Ser Thr Ser Ile Ser Ala
115 120 125

gca ctg ggc atg gcc att gcg tcg cag ctg cgc ggc gaa gac aag aag 432
Ala Leu Gly Met Ala Ile Ala Ser Gln Leu Arg Gly Glu Asp Lys Lys
130 135 140

atg gta gcc atc atc ggc gac ggt tcc atc acc ggc ggc atg gcc tat 480
Met Val Ala Ile Ile Gly Asp Gly Ser Ile Thr Gly Gly Met Ala Tyr
145 150 155 160

gag gcg atg aat cat gcc ggc gat gtg aat gcc aac ctg ctg gtg atc 528
Glu Ala Met Asn His Ala Gly Asp Val Asn Ala Asn Leu Leu Val Ile
165 170 175

ttg aac gac aac gat atg tcg atc tcg ccg ccg gtc ggg gcg atg aac 576
Leu Asn Asp Asn Asp Met Ser Ile Ser Pro Pro Val Gly Ala Met Asn
180 185 190

aat tat ctg acc aag gtg ttg tcg agc aag ttt tat tcg tcg gtg cgg 624
Asn Tyr Leu Thr Lys Val Leu Ser Ser Lys Phe Tyr Ser Ser Val Arg
195 200 205

gaa gag agc aag aaa gct ctg gcc aag atg ccg tcg gtg tgg gaa ctg 672
Glu Glu Ser Lys Lys Ala Leu Ala Lys Met Pro Ser Val Trp Glu Leu
210 215 220

gcg cgc aag acc gag gaa cac gtg aag ggc atg atc gtg ccc ggt acc 720
Ala Arg Lys Thr Glu Glu His Val Lys Gly Met Ile Val Pro Gly Thr
225 230 235 240

ttg ttc gag gaa ttg ggc ttc aat tat ttc ggc ccg atc gac ggc cat 768
Leu Phe Glu Glu Leu Gly Phe Asn Tyr Phe Gly Pro Ile Asp Gly His
245 250 255

gat gtc gag atg ctg gtg tcg acc ctg gaa aat ctg aag gat ttg acc 816
Asp Val Glu Met Leu Val Ser Thr Leu Glu Asn Leu Lys Asp Leu Thr
260 265 270

ggg ccg gta ttc ctg cat gtg gtg acc aag aag ggc aaa ggc tat gcg 864
Gly Pro Val Phe Leu His Val Val Thr Lys Lys Gly Lys Gly Tyr Ala
275 280 285

cca gcc gag aaa gac ccg ttg gcc tac cat ggc gtg ccg gct ttc gat 912
Pro Ala Glu Lys Asp Pro Leu Ala Tyr His Gly Val Pro Ala Phe Asp
290 295 300

ccg acc aag gat ttc ctg ccc aag gcg gcg ccg tcg ccg cat ccg acc 960
Pro Thr Lys Asp Phe Leu Pro Lys Ala Ala Pro Ser Pro His Pro Thr
305 310 315 320

tat acc gag gtg ttc ggc cgc tgg ctg tgc gac atg gcg gct caa gac 1008
Tyr Thr Glu Val Phe Gly Arg Trp Leu Cys Asp Met Ala Ala Gln Asp
325 330 335

gag cgc ttg ctg ggc atc acg ccg gcg atg cgc gaa ggc tct ggt ttg 1056
Glu Arg Leu Leu Gly Ile Thr Pro Ala Met Arg Glu Gly Ser Gly Leu
340 345 350

gtg gaa ttc tca cag aaa ttt ccg aat cgc tat ttc gat gtc gcc atc 1104
Val Glu Phe Ser Gln Lys Phe Pro Asn Arg Tyr Phe Asp Val Ala Ile
355 360 365

gcc gag cag cat gcg gtg acc ttg gcc gcc ggc cag gcc tgc cag ggc 1152
Ala Glu Gln His Ala Val Thr Leu Ala Ala Gly Gln Ala Cys Gln Gly
370 375 380

gcc aag ccg gtg gtg gcg att tat tcc acc ttc ctg caa cgc ggt tac 1200
Ala Lys Pro Val Val Ala Ile Tyr Ser Thr Phe Leu Gln Arg Gly Tyr
385 390 395 400

gat cag ttg atc cac gac gtg gcc ttg cag aac tta gat atg ctc ttt 1248
Asp Gln Leu Ile His Asp Val Ala Leu Gln Asn Leu Asp Met Leu Phe
405 410 415

gca ctg gat cgt gcc ggc ttg gtc ggc ccg gat gga ccg acc cat gct 1296
Ala Leu Asp Arg Ala Gly Leu Val Gly Pro Asp Gly Pro Thr His Ala
420 425 430

ggc gcc ttt gat tac agc tac atg cgc tgt att ccg aac atg ctg atc 1344
Gly Ala Phe Asp Tyr Ser Tyr Met Arg Cys Ile Pro Asn Met Leu Ile
435 440 445

atg gct cca gcc gac gag aac gag tgc agg cag atg ctg acc acc ggc 1392
Met Ala Pro Ala Asp Glu Asn Glu Cys Arg Gln Met Leu Thr Thr Gly
450 455 460

ttc caa cac cat ggc ccg gct tcg gtg cgc tat ccg cgc ggc aaa ggg 1440
Phe Gln His His Gly Pro Ala Ser Val Arg Tyr Pro Arg Gly Lys Gly
465 470 475 480

ccc ggg gcg gca atc gat ccg acc ctg acc gcg ctg gag atc ggc aag 1488
Pro Gly Ala Ala Ile Asp Pro Thr Leu Thr Ala Leu Glu Ile Gly Lys
485 490 495

gcc gaa gtc aga cac cac ggc agc cgc atc gcc att ctg gcc tgg ggc 1536
Ala Glu Val Arg His His Gly Ser Arg Ile Ala Ile Leu Ala Trp Gly
500 505 510

agc atg gtc acg cct gcc gtc gaa gcc ggc aag cag ctg ggc gcg acg 1584
Ser Met Val Thr Pro Ala Val Glu Ala Gly Lys Gln Leu Gly Ala Thr
515 520 525

gtg gtg aac atg cgt ttc gtc aag ccg ttc gat caa gcc ttg gtg ctg 1632
Val Val Asn Met Arg Phe Val Lys Pro Phe Asp Gln Ala Leu Val Leu
530 535 540

gaa ttg gcc agg acg cac gat gtg ttc gtc acc gtc gag gaa aac gtc 1680
Glu Leu Ala Arg Thr His Asp Val Phe Val Thr Val Glu Glu Asn Val
545 550 555 560

atc gcc ggc ggc gct ggc agt gcg atc aac acc ttc ctg cag gcg cag 1728
Ile Ala Gly Gly Ala Gly Ser Ala Ile Asn Thr Phe Leu Gln Ala Gln
565 570 575

aag gtg ctg atg ccg gtc tgc aac atc ggc ctg ccc gac cgc ttc gtc 1776
Lys Val Leu Met Pro Val Cys Asn Ile Gly Leu Pro Asp Arg Phe Val
580 585 590

gag caa ggt agt cgc gag gaa ttg ctc agc ctg gtc ggc ctc gac agc 1824
Glu Gln Gly Ser Arg Glu Glu Leu Leu Ser Leu Val Gly Leu Asp Ser
595 600 605

aag ggc atc ttc gcc acc atc gaa cag ttt tgc gct 1860
Lys Gly Ile Phe Ala Thr Ile Glu Gln Phe Cys Ala
610 615 620



<210> 14
<211> 620
<212> PRT
<213> Methylomonas 16a
<400> 14

Met Ala Leu Ser Lys Asp Phe Pro Leu Leu Asn Ser Ile His Thr Pro
1 5 10 15

Ala Asp Ile Arg Ala Leu Ser Lys Asp Gln Leu Gln Gln Leu Ala Asp
20 25 30

Glu Val Arg Gly Tyr Leu Thr His Thr Val Ser Ile Ser Gly Gly His
35 40 45

Phe Ala Ala Gly Leu Gly Thr Val Glu Leu Thr Val Ala Leu His Tyr
50 55 60

Val Phe Asn Thr Pro Val Asp Gln Leu Val Trp Asp Val Gly His Gln
65 70 75 80

Ala Tyr Pro His Lys Ile Leu Thr Gly Arg Lys Glu Arg Met Pro Thr
85 90 95

Ile Arg Thr Leu Gly Gly Val Ser Ala Phe Pro Ala Arg Asp Glu Ser
100 105 110

Glu Tyr Asp Ala Phe Gly Val Gly His Ser Ser Thr Ser Ile Ser Ala
115 120 125

Ala Leu Gly Met Ala Ile Ala Ser Gln Leu Arg Gly Glu Asp Lys Lys
130 135 140

Met Val Ala Ile Ile Gly Asp Gly Ser Ile Thr Gly Gly Met Ala Tyr
145 150 155 160

Glu Ala Met Asn His Ala Gly Asp Val Asn Ala Asn Leu Leu Val Ile
165 170 175

Leu Asn Asp Asn Asp Met Ser Ile Ser Pro Pro Val Gly Ala Met Asn
180 185 190

Asn Tyr Leu Thr Lys Val Leu Ser Ser Lys Phe Tyr Ser Ser Val Arg
195 200 205

Glu Glu Ser Lys Lys Ala Leu Ala Lys Met Pro Ser Val Trp Glu Leu
210 215 220

Ala Arg Lys Thr Glu Glu His Val Lys Gly Met Ile Val Pro Gly Thr
225 230 235 240

Leu Phe Glu Glu Leu Gly Phe Asn Tyr Phe Gly Pro Ile Asp Gly His
245 250 255

Asp Val Glu Met Leu Val Ser Thr Leu Glu Asn Leu Lys Asp Leu Thr
260 265 270

Gly Pro Val Phe Leu His Val Val Thr Lys Lys Gly Lys Gly Tyr Ala
275 280 285

Pro Ala Glu Lys Asp Pro Leu Ala Tyr His Gly Val Pro Ala Phe Asp
290 295 300

Pro Thr Lys Asp Phe Leu Pro Lys Ala Ala Pro Ser Pro His Pro Thr
305 310 315 320

Tyr Thr Glu Val Phe Gly Arg Trp Leu Cys Asp Met Ala Ala Gln Asp
325 330 335

Glu Arg Leu Leu Gly Ile Thr Pro Ala Met Arg Glu Gly Ser Gly Leu
340 345 350

Val Glu Phe Ser Gln Lys Phe Pro Asn Arg Tyr Phe Asp Val Ala Ile
355 360 365

Ala Glu Gln His Ala Val Thr Leu Ala Ala Gly Gln Ala Cys Gln Gly
370 375 380

Ala Lys Pro Val Val Ala Ile Tyr Ser Thr Phe Leu Gln Arg Gly Tyr
385 390 395 400

Asp Gln Leu Ile His Asp Val Ala Leu Gln Asn Leu Asp Met Leu Phe
405 410 415

Ala Leu Asp Arg Ala Gly Leu Val Gly Pro Asp Gly Pro Thr His Ala
420 425 430

Gly Ala Phe Asp Tyr Ser Tyr Met Arg Cys Ile Pro Asn Met Leu Ile
435 440 445

Met Ala Pro Ala Asp Glu Asn Glu Cys Arg Gln Met Leu Thr Thr Gly
450 455 460

Phe Gln His His Gly Pro Ala Ser Val Arg Tyr Pro Arg Gly Lys Gly
465 470 475 480

Pro Gly Ala Ala Ile Asp Pro Thr Leu Thr Ala Leu Glu Ile Gly Lys
485 490 495

Ala Glu Val Arg His His Gly Ser Arg Ile Ala Ile Leu Ala Trp Gly
500 505 510

Ser Met Val Thr Pro Ala Val Glu Ala Gly Lys Gln Leu Gly Ala Thr
515 520 525

Val Val Asn Met Arg Phe Val Lys Pro Phe Asp Gln Ala Leu Val Leu
530 535 540

Glu Leu Ala Arg Thr His Asp Val Phe Val Thr Val Glu Glu Asn Val
545 550 555 560

Ile Ala Gly Gly Ala Gly Ser Ala Ile Asn Thr Phe Leu Gln Ala Gln
565 570 575

Lys Val Leu Met Pro Val Cys Asn Ile Gly Leu Pro Asp Arg Phe Val
580 585 590

Glu Gln Gly Ser Arg Glu Glu Leu Leu Ser Leu Val Gly Leu Asp Ser
595 600 605

Lys Gly Ile Phe Ala Thr Ile Glu Gln Phe Cys Ala
610 615 620

<210> 15
<211> 982
<212> DNA
<213> Methylomonas 16a
<220>
<221> CDS
<222> (22)..(975)
<400> 15

cccagtaaaa cactcaagaa t atg caa atc gta ctc gca aac ccc cgt gga 51
Met Gln Ile Val Leu Ala Asn Pro Arg Gly
1 5 10

ttc tgt gcc ggc gtg gac cgg gcc att gaa att gtc gat caa gcc atc 99
Phe Cys Ala Gly Val Asp Arg Ala Ile Glu Ile Val Asp Gln Ala Ile
15 20 25

gaa gcc ttt ggt gcg ccg att tat gtg cgg cac gag gtg gtg cat aac 147
Glu Ala Phe Gly Ala Pro Ile Tyr Val Arg His Glu Val Val His Asn
30 35 40

cgc acc gtg gtc gat gga ctg aaa caa aaa ggt gcg gtg ttc atc gag 195
Arg Thr Val Val Asp Gly Leu Lys Gln Lys Gly Ala Val Phe Ile Glu
45 50 55

gaa cta agc gat gtg ccg gtg ggt tcc tac ttg att ttc agc gcg cac 243
Glu Leu Ser Asp Val Pro Val Gly Ser Tyr Leu Ile Phe Ser Ala His
60 65 70

ggc gta tcc aag gag gtg caa cag gaa gcc gag gag cgc cag ttg acg 291
Gly Val Ser Lys Glu Val Gln Gln Glu Ala Glu Glu Arg Gln Leu Thr
75 80 85 90

gta ttc gat gcg act tgt ccg ctg gtg acc aaa gtg cac atg cag gtt 339
Val Phe Asp Ala Thr Cys Pro Leu Val Thr Lys Val His Met Gln Val
95 100 105

gcc aag cat gcc aaa cag ggc cga gaa gtg att ttg atc ggc cac gcc 387
Ala Lys His Ala Lys Gln Gly Arg Glu Val Ile Leu Ile Gly His Ala
110 115 120

ggt cat ccg gaa gtg gaa ggc acg atg ggc cag tat gaa aaa tgc acc 435
Gly His Pro Glu Val Glu Gly Thr Met Gly Gln Tyr Glu Lys Cys Thr
125 130 135

gaa ggc ggc ggc att tat ctg gtc gaa act ccg gaa gac gta cgc aat 483
Glu Gly Gly Gly Ile Tyr Leu Val Glu Thr Pro Glu Asp Val Arg Asn
140 145 150

ttg aaa gtc aac aat ccc aat gat ctg gcc tat gtg acg cag acg acc 531
Leu Lys Val Asn Asn Pro Asn Asp Leu Ala Tyr Val Thr Gln Thr Thr
155 160 165 170

ttg tcg atg acc gac acc aag gtc atg gtg gat gcg tta cgc gaa caa 579
Leu Ser Met Thr Asp Thr Lys Val Met Val Asp Ala Leu Arg Glu Gln
175 180 185

ttt ccg tcc att aag gag caa aaa aag gac gat att tgt tac gcg acg 627
Phe Pro Ser Ile Lys Glu Gln Lys Lys Asp Asp Ile Cys Tyr Ala Thr
190 195 200

caa aac cgt cag gat gcg gtg cat gat ctg gcc aag att tcc gac ctg 675
Gln Asn Arg Gln Asp Ala Val His Asp Leu Ala Lys Ile Ser Asp Leu
205 210 215

att ctg gtt gtc ggc tct ccc aat agt tcg aat tcc aac cgt ttg cgt 723
Ile Leu Val Val Gly Ser Pro Asn Ser Ser Asn Ser Asn Arg Leu Arg
220 225 230

gaa atc gcc gtg caa ctc ggt aaa ccc gct tat ttg atc gat act tac 771
Glu Ile Ala Val Gln Leu Gly Lys Pro Ala Tyr Leu Ile Asp Thr Tyr
235 240 245 250

cag gat ttg aag caa gat tgg ctg gag gga att gaa gta gtc ggg gtt 819
Gln Asp Leu Lys Gln Asp Trp Leu Glu Gly Ile Glu Val Val Gly Val
255 260 265

acc gcg ggc gct tcg gcg ccg gaa gtg ttg gtg cag gaa gtg atc gat 867
Thr Ala Gly Ala Ser Ala Pro Glu Val Leu Val Gln Glu Val Ile Asp
270 275 280

caa ctg aag gca tgg ggc ggc gaa acc act tcg gtc aga gaa aac agc 915
Gln Leu Lys Ala Trp Gly Gly Glu Thr Thr Ser Val Arg Glu Asn Ser
285 290 295

ggc atc gag gaa aag gta gtc ttt tcg att ccc aag gag ttg aaa aaa 963
Gly Ile Glu Glu Lys Val Val Phe Ser Ile Pro Lys Glu Leu Lys Lys
300 305 310

cat atg caa gcg tgatcaa 982
His Met Gln Ala
315

<210> 16
<211> 318
<212> PRT
<213> Methylomonas 16a
<400> 16

Met Gln Ile Val Leu Ala Asn Pro Arg Gly Phe Cys Ala Gly Val Asp
1 5 10 15

Arg Ala Ile Glu Ile Val Asp Gln Ala Ile Glu Ala Phe Gly Ala Pro
20 25 30

Ile Tyr Val Arg His Glu Val Val His Asn Arg Thr Val Val Asp Gly
35 40 45

Leu Lys Gln Lys Gly Ala Val Phe Ile Glu Glu Leu Ser Asp Val Pro
50 55 60

Val Gly Ser Tyr Leu Ile Phe Ser Ala His Gly Val Ser Lys Glu Val
65 70 75 80

Gln Gln Glu Ala Glu Glu Arg Gln Leu Thr Val Phe Asp Ala Thr Cys
85 90 95

Pro Leu Val Thr Lys Val His Met Gln Val Ala Lys His Ala Lys Gln
100 105 110

Gly Arg Glu Val Ile Leu Ile Gly His Ala Gly His Pro Glu Val Glu
115 120 125

Gly Thr Met Gly Gln Tyr Glu Lys Cys Thr Glu Gly Gly Gly Ile Tyr
130 135 140

Leu Val Glu Thr Pro Glu Asp Val Arg Asn Leu Lys Val Asn Asn Pro
145 150 155 160

Asn Asp Leu Ala Tyr Val Thr Gln Thr Thr Leu Ser Met Thr Asp Thr
165 170 175

Lys Val Met Val Asp Ala Leu Arg Glu Gln Phe Pro Ser Ile Lys Glu
180 185 190

Gln Lys Lys Asp Asp Ile Cys Tyr Ala Thr Gln Asn Arg Gln Asp Ala
195 200 205

Val His Asp Leu Ala Lys Ile Ser Asp Leu Ile Leu Val Val Gly Ser
210 215 220

Pro Asn Ser Ser Asn Ser Asn Arg Leu Arg Glu Ile Ala Val Gln Leu
225 230 235 240

Gly Lys Pro Ala Tyr Leu Ile Asp Thr Tyr Gln Asp Leu Lys Gln Asp
245 250 255

Trp Leu Glu Gly Ile Glu Val Val Gly Val Thr Ala Gly Ala Ser Ala
260 265 270

Pro Glu Val Leu Val Gln Glu Val Ile Asp Gln Leu Lys Ala Trp Gly
275 280 285

Gly Glu Thr Thr Ser Val Arg Glu Asn Ser Gly Ile Glu Glu Lys Val
290 295 300

Val Phe Ser Ile Pro Lys Glu Leu Lys Lys His Met Gln Ala
305 310 315

<210> 17
<211> 1254
<212> DNA
<213> Methylomonas 16a
<220>
<221> CDS
<222> (73)..(1254)
<400> 17

ggtggacagc atcattgcgg cggcaccgtt tttctatgcc ggtatcgtgc tgatcggacg 60

gagcgtattc ga atg aaa ggt att tgc ata ttg ggc gct acc ggt tcg atc 111
Met Lys Gly Ile Cys Ile Leu Gly Ala Thr Gly Ser Ile
1 5 10

ggt gtc agc acg ctg gat gtc gtt gcc agg cat ccg gat aaa tat caa 159
Gly Val Ser Thr Leu Asp Val Val Ala Arg His Pro Asp Lys Tyr Gln
15 20 25

gtc gtt gcg ctg acc gcc aac ggc aat atc gac gca ttg tat gaa caa 207
Val Val Ala Leu Thr Ala Asn Gly Asn Ile Asp Ala Leu Tyr Glu Gln
30 35 40 45

tgc ctg gcc cac cat ccg gag tat gcg gtg gtg gtc atg gaa agc aag 255
Cys Leu Ala His His Pro Glu Tyr Ala Val Val Val Met Glu Ser Lys
50 55 60

gta gca gag ttc aaa cag cgc att gcc gct tcg ccg gta gcg gat atc 303
Val Ala Glu Phe Lys Gln Arg Ile Ala Ala Ser Pro Val Ala Asp Ile
65 70 75

aag gtc ttg tcg ggt agc gag gcc ttg caa cag gtg gcc acg ctg gaa 351
Lys Val Leu Ser Gly Ser Glu Ala Leu Gln Gln Val Ala Thr Leu Glu
80 85 90

aac gtc gat acg gtg atg gcg gct atc gtc ggc gcg gcc gga ttg ttg 399
Asn Val Asp Thr Val Met Ala Ala Ile Val Gly Ala Ala Gly Leu Leu
95 100 105

ccg acc ttg gcc gcg gcc aag gcc ggc aaa acc gtg ctg ttg gcc aac 447
Pro Thr Leu Ala Ala Ala Lys Ala Gly Lys Thr Val Leu Leu Ala Asn
110 115 120 125

aag gaa gcc ttg gtg atg tcg gga caa atc ttc atg cag gcc gtc agc 495
Lys Glu Ala Leu Val Met Ser Gly Gln Ile Phe Met Gln Ala Val Ser
130 135 140

gat tcc ggc gct gtg ttg ctg ccg ata gac agc gag cac aac gcc atc 543
Asp Ser Gly Ala Val Leu Leu Pro Ile Asp Ser Glu His Asn Ala Ile
145 150 155

ttt cag tgc atg ccg gcg ggt tat acg cca ggc cat aca gcc aaa cag 591
Phe Gln Cys Met Pro Ala Gly Tyr Thr Pro Gly His Thr Ala Lys Gln
160 165 170

gcg cgc cgc att tta ttg acc gct tcc ggt ggc cca ttt cga cgg acg 639
Ala Arg Arg Ile Leu Leu Thr Ala Ser Gly Gly Pro Phe Arg Arg Thr
175 180 185

ccg ata gaa acg ttg tcc agc gtc acg ccg gat cag gcc gtt gcc cat 687
Pro Ile Glu Thr Leu Ser Ser Val Thr Pro Asp Gln Ala Val Ala His
190 195 200 205

cct aaa tgg gac atg ggg cgc aag att tcg gtc gat tcc gcc acc atg 735
Pro Lys Trp Asp Met Gly Arg Lys Ile Ser Val Asp Ser Ala Thr Met
210 215 220

atg aac aaa ggt ctc gaa ctg atc gaa gcc tgc ttg ttg ttc aac atg 783
Met Asn Lys Gly Leu Glu Leu Ile Glu Ala Cys Leu Leu Phe Asn Met
225 230 235

gag ccc gac cag att gaa gtc gtc att cat ccg cag agc atc att cat 831
Glu Pro Asp Gln Ile Glu Val Val Ile His Pro Gln Ser Ile Ile His
240 245 250

tcg atg gtg gac tat gtc gat ggt tcg gtt ttg gcg cag atg ggt aat 879
Ser Met Val Asp Tyr Val Asp Gly Ser Val Leu Ala Gln Met Gly Asn
255 260 265

ccc gac atg cgc acg ccg ata gcg cac gcg atg gcc tgg ccg gaa cgc 927
Pro Asp Met Arg Thr Pro Ile Ala His Ala Met Ala Trp Pro Glu Arg
270 275 280 285

ttt gac tct ggt gtg gcg ccg ctg gat att ttc gaa gta ggg cac atg 975
Phe Asp Ser Gly Val Ala Pro Leu Asp Ile Phe Glu Val Gly His Met
290 295 300

gat ttc gaa aaa ccc gac ttg aaa cgg ttt cct tgt ctg aga ttg gct 1023
Asp Phe Glu Lys Pro Asp Leu Lys Arg Phe Pro Cys Leu Arg Leu Ala
305 310 315

tat gaa gcc atc aag tct ggt gga att atg cca acg gta ttg aac gca 1071
Tyr Glu Ala Ile Lys Ser Gly Gly Ile Met Pro Thr Val Leu Asn Ala
320 325 330

gcc aat gaa att gct gtc gaa gcg ttt tta aat gaa gaa gtc aaa ttc 1119
Ala Asn Glu Ile Ala Val Glu Ala Phe Leu Asn Glu Glu Val Lys Phe
335 340 345

act gac atc gcg gtc atc atc gag cgc agc atg gcc cag ttt aaa ccg 1167
Thr Asp Ile Ala Val Ile Ile Glu Arg Ser Met Ala Gln Phe Lys Pro
350 355 360 365

gac gat gcc ggc agc ctc gaa ttg gtt ttg cag gcc gat caa gat gcg 1215
Asp Asp Ala Gly Ser Leu Glu Leu Val Leu Gln Ala Asp Gln Asp Ala
370 375 380



cgc gag gtg gct aga gac atc atc aag acc ttg gta gct 1254
Arg Glu Val Ala Arg Asp Ile Ile Lys Thr Leu Val Ala
385 390

<210> 18
<211> 394
<212> PRT
<213> Methylomonas 16a
<400> 18

Met Lys Gly Ile Cys Ile Leu Gly Ala Thr Gly Ser Ile Gly Val Ser
1 5 10 15

Thr Leu Asp Val Val Ala Arg His Pro Asp Lys Tyr Gln Val Val Ala
20 25 30

Leu Thr Ala Asn Gly Asn Ile Asp Ala Leu Tyr Glu Gln Cys Leu Ala
35 40 45

His His Pro Glu Tyr Ala Val Val Val Met Glu Ser Lys Val Ala Glu
50 55 60

Phe Lys Gln Arg Ile Ala Ala Ser Pro Val Ala Asp Ile Lys Val Leu
65 70 75 80

Ser Gly Ser Glu Ala Leu Gln Gln Val Ala Thr Leu Glu Asn Val Asp
85 90 95

Thr Val Met Ala Ala Ile Val Gly Ala Ala Gly Leu Leu Pro Thr Leu
100 105 110

Ala Ala Ala Lys Ala Gly Lys Thr Val Leu Leu Ala Asn Lys Glu Ala
115 120 125

Leu Val Met Ser Gly Gln Ile Phe Met Gln Ala Val Ser Asp Ser Gly
130 135 140

Ala Val Leu Leu Pro Ile Asp Ser Glu His Asn Ala Ile Phe Gln Cys
145 150 155 160

Met Pro Ala Gly Tyr Thr Pro Gly His Thr Ala Lys Gln Ala Arg Arg
165 170 175

Ile Leu Leu Thr Ala Ser Gly Gly Pro Phe Arg Arg Thr Pro Ile Glu
180 185 190

Thr Leu Ser Ser Val Thr Pro Asp Gln Ala Val Ala His Pro Lys Trp
195 200 205

Asp Met Gly Arg Lys Ile Ser Val Asp Ser Ala Thr Met Met Asn Lys
210 215 220

Gly Leu Glu Leu Ile Glu Ala Cys Leu Leu Phe Asn Met Glu Pro Asp
225 230 235 240

Gln Ile Glu Val Val Ile His Pro Gln Ser Ile Ile His Ser Met Val
245 250 255

Asp Tyr Val Asp Gly Ser Val Leu Ala Gln Met Gly Asn Pro Asp Met
260 265 270

Arg Thr Pro Ile Ala His Ala Met Ala Trp Pro Glu Arg Phe Asp Ser
275 280 285

Gly Val Ala Pro Leu Asp Ile Phe Glu Val Gly His Met Asp Phe Glu
290 295 300

Lys Pro Asp Leu Lys Arg Phe Pro Cys Leu Arg Leu Ala Tyr Glu Ala
305 310 315 320

Ile Lys Ser Gly Gly Ile Met Pro Thr Val Leu Asn Ala Ala Asn Glu
325 330 335

Ile Ala Val Glu Ala Phe Leu Asn Glu Glu Val Lys Phe Thr Asp Ile
340 345 350

Ala Val Ile Ile Glu Arg Ser Met Ala Gln Phe Lys Pro Asp Asp Ala
355 360 365

Gly Ser Leu Glu Leu Val Leu Gln Ala Asp Gln Asp Ala Arg Glu Val
370 375 380

Ala Arg Asp Ile Ile Lys Thr Leu Val Ala
385 390

<210> 19 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer #1 for amplification of crt gene cluster 

<400> 19 

atgacggtct gcgcaaaaaa acacg 25 

<210> 20 

<211> 28 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer #2 for amplification of crt gene cluster 

<400> 20 

gagaaattat gttgtggatt tggaatgc 28 

<210> 21 

<211> 61 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 5'kan(dxs)

<400> 21 

tggaagcgct agcggactac atcatccagc gtaataaata acgtcttgag cgattgtgta 60 
g 61

<210> 22 

<211> 65 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 5'kan(idi) 

<400> 22 

tctgatgcgc aagctgaaga aaaatgagca tggagaataa tatgacgtct tgagcgattg 60 
tgtag 65 

<210> 23 
<211> 65 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 5'kan(ygbBP) 
<400> 23 

gacgcgtcga agcgcgcaca gtctgcgggg caaaacaatc gataacgtct tgagcgattg 60 
tgtag 65 

<210> 24 

<211> 60 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 5'kan(ispAdxs) 

<400> 24 

accatgacgg ggcgaaaaat attgagagtc agacattcat gtgtaggctg gagctgcttc 60 

<210> 25 

<211> 64 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 3'kan 

<400> 25 

gaagacgaaa gggcctcgtg atacgcctat ttttataggt tatatgaata tcctccttag 60 
ttcc 64 

<210> 26 

<211> 50 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 5'-T5 

<400> 26 

ctaaggagga tattcatata acctataaaa ataggcgtat cacgaggccc 50 

<210> 27 
<211> 70 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 3'-T5(dxs) 
<400> 27 

ggagtcgacc agtgccaggg tcgggtattt ggcaatatca aaactcatag ttaatttctc 60 
ctctttaatg 70 

<210> 28 
<211> 68 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 3'-T5(idi) 
<400> 28 

tgggaactcc ctgtgcattc aataaaatga cgtgttccgt ttgcatagtt aatttctcct 60 
ctttaatg 68 
<210> 29 
<211> 68 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 3'-T5(ygbBP) 
<400> 29 

cggccgccgg aaccacggcg caaacatcca aatgagtggt tgccatagtt aatttctcct 60 
ctttaatg 68 

<210> 30 

<211> 62 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 3'-T5(ispAdxs) 

<400> 30 

cctgcttaac gcaggcttcg agttgctgcg gaaagtccat agttaatttc tcctctttaa 60 
tg 62 

<210> 31 

<211> 65 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 5'-kanT5(ispB) 

<400> 31 

accataaacc ctaagttgcc tttgttcaca gtaaggtaat cggggcgtct tgagcgattg 60 
tgtag 65 

<210> 32 
<211> 67 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 3'-kanT5(ispB) 
<400> 32 

cgccatatct tgcgcggtta actcattgat tttttctaaa ttcatagtta atttctcctc 60 
tttaatg 67 

<210> 33 

<211> 156 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> phage T5 promoter sequence 

<400> 33 

ctataaaaat aggcgtatca cgaggccctt tcgtcttcac ctcgagaaat cataaaaaat 60 

ttatttgctt tgtgagcgga taacaattat aatagattca attgtgagcg gataacaatt 120 
tcacacagaa ttcattaaag aggagaaatt aactca 156 

<210> 34 

<211> 65 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 5'-kanT5(dxs16a) 

<400> 34 

cactaacgcc cgcacattgc tgcgggcttt ttgattcatt tcgcacgtct tgagcgattg 60 
tgtag 65 

<210> 35 

<211> 65 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 5'-kanT5(dxr16a) 

<400> 35 

taaagggcta agagtagtgt gctcttagcc cttaattacg tttcccgtct tgagcgattg 60 
tgtag 65 

<210> 36 

<211> 65 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 5'-kanT5(lytB16a) 

<400> 36 

ctacaactgg cgagatgcat agcgagtata atttgtattt tgcgtcgtct tgagcgattg 60 
tgtag 65 

<210> 37 

<211> 51 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 3'-kanT5(dxs16a) 

<400> 37 

agtagaggga agtctttgga aagagccata gttaatttct cctctttaat g 51 

<210> 38 
<211> 51 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 3'-kanT5(dxr16a) 
<400> 38 

acggtgccgc cgcaatgatg ctgtccacca gttaatttct cctctttaat g 51 
<210> 39 

<211> 51 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 3'-kanT5(lytB16a) 

<400> 39 

ccacgggggt ttgcgagtac gatttgcata gttaatttct cctctttaat g 51 



<210> 40 

<211> 55 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 5'-(dxs16a) 

<400> 40 

acagaattca ttaaagagga gaaattaact atggctcttt ccaaagactt ccctc 55 



<210> 41 

<211> 55 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 5'-(dxr16a) 

<400> 41 

acagaattca ttaaagagga gaaattaact ggtggacagc atcattgcgg cggca 55 



<210> 42 

<211> 55 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 5'-(lytB16a) 

<400> 42 

acagaattca ttaaagagga gaaattaact atgcaaatcg tactcgcaaa ccccc 55 



<210> 43 
<211> 68 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 3'-(dxs16a) 
<400> 43 

aggagcgaag tgattatcag tatgctgttc atatagcctc gaattatcaa gcgcaaaact 60 
gttcgatg 68 



<210> 44 
<211> 67 
<212> DNA 

<213> Artificial sequence 
<220> 
<223> Primer 3'-(dxr16a) 
<400> 44 

ggcattttca ctctggcaat gcgcataaac gctttcaaag tcctgttaag ctaccaaggt 60 
cttgatg 67 



<210> 45 
<211> 68 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer 3'-(lytB16a) 
<400> 45 

agtggcggac gggcaaacaa gggtaacata ggatcaatga gggttattga tcacgcttgc 60 
atatgttt 68 



<210> 46 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer T-kan 

<400> 46 

accggatatc accacttatc tgctc 25 



<210> 47 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer B-ispA 

<400> 47 

cctaataatg cgccatactg catgg 25 



<210> 48 

<211> 32 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer T-T5 

<400> 48 

taacctataa aaataggcgt atcacgaggc cc 32 



<210> 49 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer B-idi 

<400> 49 

tcatgctgac ctggtgaagg aatcc 25 

<210> 50 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer B-dxs(16a) 

<400> 50 

gcgatattgt atgtctgatt cagga 25 

<210> 51 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer B-lytB(16a) 

<400> 51 

tccactggat gcgggaagct ggcag 25 

<210> 52 

<211> 26 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer B-dxs 

<400> 52 

tggcaacagt cgtagctcct gggtgg 26 

<210> 53 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer B-ygb 

<400> 53 

ccagcagcgc atgcaccgag tgttc 25 

<210> 54 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer Tn5PCRF 

<400> 54 

gctgagttga aggatcagat c 21 

<210> 55 

<211> 21 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer Tn5PCRR 
<400> 55 

cgagcaagac gtttcccgtt g 21 

<210> 56 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer Kan-2 FP-1 

<400> 56 

acctacaaca aagctctcat caacc 25 

<210> 57 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer Kan-2 RP-1 

<400> 57 

gcaatgtaac atcagagatt ttgag 25 

<210> 58 

<211> 20 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer Y15_F 

<400> 58 

ggatcgatct tgagatgacc 20 

<210> 59 

<211> 24 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer Y15_R 

<400> 59 

gctttcgtaa ttttcgcatt tctg 24 

<210> 60 

<211> 25 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer T-Tn5yjeR 

<400> 60 

gcaatgtaac atcagagatt ttgag 25 

<210> 61 
<211> 24 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer B-yjeR 

<400> 61 

gctttcgtaa ttttcgcatt tctg 24 

<210> 62 

<211> 26 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Primer B-ispB 

<400> 62 

agtacagcaa tcatcggacg aatacg 26 

<210> 63 

<211> 1845 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Sequence yjeR::Tn5 mutant gene (transposon-disrupted yjeR) 



<400> 63 
atgggcaaaa catctatgat acacgcaatt gtggatcaat atagtcactg tgaatgggtg 60 

gaaaatagca tgagtgccaa tgaaaacaac ctgatttgga tcgatcttga gatgaccggt 120 

ctggatcccg agcgcgatcg cattattgag attgccacgc tggtgaccga tgccaacctg 180 

aatattctgg cagaagggcc gaccattgca gtacaccagt ctgatgaaca gctggcgctg 240 

atggatgact ggaacgtgcg cacccatacc gccagcgggc tggtagagcg cgtgaaagcg 300 

agcacgatgg gcgatcggga agctgaactg gcaacgctcg aatttttaaa acagtgggtg 360 

cctgcgggaa aatcgccgat ttgcggtaac agcatcggtc aggaccgtcg tttcctgttt 420 

aaatacatgc cggagctgga agcctacttc cactaccgtt atctcgatgt cagcaccctg 480 

aaagagctgg cgcgccgctg gaagccggaa attctggatg gttttaccaa gcaggggacg 540 

catcaggcga tggatgatat ccgtgaatcg gtggcggagc tggcttacta cctgtctctt 600 

atacacatct caaccctgaa gcttgcatgc ctgcaggtcg actctagagg atccccgcca 660 

cggttgatga gagctttgtt gtaggtggac cagttggtga ttttgaactt ttgctttgcc 720 

acggaacggt ctgcgttgtc gggaagatgc gtgatctgat ccttcaactc agcaaaagtt 780 

cgatttattc aacaaagccg ccgtcccgtc aagtcagcgt aatgctctgc cagtgttaca 840 

accaattaac caattctgat tagaaaaact catcgagcat caaatgaaac tgcaatttat 900 

tcatatcagg attatcaata ccatattttt gaaaaagccg tttctgtaat gaaggagaaa 960 

actcaccgag gcagttccat aggatggcaa gatcctggta tcggtctgcg attccgactc 1020 

gtccaacatc aatacaacct attaatttcc cctcgtcaaa aataaggtta tcaagtgaga 1080 

aatcaccatg agtgacgact gaatccggtg agaatggcaa aagtttatgc atttctttcc 1140 

agacttgttc aacaggccag ccattacgct cgtcatcaaa atcactcgca tcaaccaaac 1200 

cgttattcat tcgtgattgc gcctgagcga gacgaaatac gcgatcgctg 

aattacaaac aggaatcgaa tgcaaccggc gcaggaacac tgccagcgca tcaacaatat 1320 

tttcacctga atcaggatat tcttctaata cctggaatgc tgtttttccg gggatcgcag 1380 

tggtgagtaa ccatgcatca tcaggagtac ggataaaatg cttgatggtc ggaagaggca 1440 

taaattccgt cagccagttt agtctgacca tctcatctgt aacatcattg gcaacgctac 1500 

ctttgccatg tttcagaaac aactctggcg catcgggctt cccatacaat cgatagattg 1560 

tcgcacctga ttgcccgaca ttatcgcgag cccatttata cccatataaa tcagcatcca 1620 

tgttggaatt taatcgcggc ctcgagcaag acgtttcccg ttgaatatgg ctcataacac 1680 

cccttgtatt actgtttatg taagcagaca gttttattgt tcatgatgat atatttttat 1740 

cttgtgcaat gtaacatcag agattttgag acacaattca tcgatgatgg ttgagatgtg 1800 

tataagagac aggcttacta ccgcgagcat tttatcaagc tgtaa 1845 



<210> 64 

<211> 8609 

<212> DNA 

<213> Artificial sequence 
<220> 

<223> Plasmid pPCB15 

<400> 64 



cgtatggcaa tgaaagacgg tgagctggtg atatgggata gtgttcaccc ttgttacacc 60 

gttttccatg agcaaactga aacgttttca tcgctctgga gtgaatacca cgacgatttc 120 

cggcagtttc tacacatata ttcgcaagat gtggcgtgtt acggtgaaaa cctggcctat 180 

ttccctaaag ggtttattga gaatatgttt ttcgtctcag ccaatccctg ggtgagtttc 240 

accagttttg atttaaacgt ggccaatatg gacaacttct tcgcccccgt tttcaccatg 300 

ggcaaatatt atacgcaagg cgacaaggtg ctgatgccgc tggcgattca ggttcatcat 360 

gccgtctgtg atggcttcca tgtcggcaga atgcttaatg aattacaaca gtactgcgat 420 

gagtggcagg gcggggcgta atttttttaa ggcagttatt ggtgcctaga aatattttat 480 

ctgattaata agatgatctt cttgagatcg ttttggtctg cgcgtaatct cttgctctga 540 

aaacgaaaaa accgccttgc agggcggttt ttcgaaggtt ctctgagcta ccaactcttt 600 

gaaccgaggt aactggcttg gaggagcgca gtcaccaaaa cttgtccttt cagtttagcc 660 

ttaaccggcg catgacttca agactaactc ctctaaatca attaccagtg gctgctgcca 720 

gtggtgcttt tgcatgtctt tccgggttgg actcaagacg atagttaccg gataaggcgc 780 

agcggtcgga ctgaacgggg ggttcgtgca tacagtccag cttggagcga actgcctacc 840 

cggaactgag tgtcaggcgt ggaatgagac aaacgcggcc ataacagcgg aatgacaccg 900 

gtaaaccgaa aggcaggaac aggagagcgc acgagggagc cgccagggga aacgcctggt 960 

atctttatag tcctgtcggg tttcgccacc actgatttga gcgtcagatt tcgtgatgct 1020 

tgtcaggggg gcggagccta tggaaaaacg gctttgccgc ggccctctca cttccctgtt 1080 

aagtatcttc ctggcatctt ccaggaaatc tccgccccgt tcgtaagcca tttccgctcg 1140 

ccgcagtcga acgaccgagc gtagcgagtc agtgagcgag gaagcggaat atatcctgta 1200 

tcacatattc tgctgacgca ccggtgcagc cttttttctc ctgccacatg aagcacttca 1260 

ctgacaccct catcagtgcc aacatagtaa gccagtatat acactccgct agcgcccaat 1320 

acgcaaaccg cctctccccg cgcgttggcc gattcattaa tgcagctggc acgacaggtt 1380 

tcccgactgg aaagcgggca gtgagcgcaa cgcaattaat gtgagttagc tcactcatta 1440 

ggcaccccag gctttacact ttatgcttcc ggctcgtatg ttgtgtggaa ttgtgagcgg 1500 

ataacaattt cacacaggaa acagctatga ccatgattac gaattcgagc tcggtaccca 1560 

aacgaattcg cccttttgac ggtctgcgca aaaaaacacg ttcaccttac tggcatttcg 1620 

gctgagcagt tgctggctga tatcgatagc cgccttgatc agttactgcc ggttcagggt 1680 

gagcgggatt Qtgtgggtgc cgcgatgcgt gaaggcacgc tggcaccggg caaacgtatt 1740 

cgtccgatgc tgctgttatt aacagcgcgc gatcttggct gtgcgatcag tcacggggga 1800 

ttactggatt tagcctgcgc ggttgaaatg gtgcatgctg cctcgctgat tctggatgat 1860 

atgccctgca tggacgatgc gcagatgcgt cgggggcgtc ccaccattca cacgcagtac 1920 

ggtgaacatg tggcgattct ggcggcggtc gctttactca gcaaagcgtt tggggtgatt 1980 

gccgaggctg aaggtctgac gccgatagcc aaaactcgcg cggtgtcgga gctgtccact 2040 

gcgattggca tgcagggtct ggttcagggc cagtttaagg acctctcgga aggcgataaa 2100 

ccccgcagcg ccgatgccat actgctaacc aatcagttta aaaccagcac gctgttttgc 2160 

gcgtcaacgc aaatggcgtc cattgcggcc aacgcgtcct gcgaagcgcg tgagaacctg 2220 

catcgtttct cgctcgatct cggccaggcc tttcagttgc ttgacgatct taccgatggc 2280 

atgaccgata ccggcaaaga catcaatcag gatgcaggta aatcaacgct ggtcaattta 2340 

ttaggctcag gcgcggtcga agaacgcctg cgacagcatt tgcgcctggc cagtgaacac 2400 

ctttccgcgg catgccaaaa cggccattcc accacccaac tttttattca ggcctggttt 2460 

gacaaaaaac tcgctgccgt cagttaagga tgctgcatga gccattttgc ggtgatcgca 2520 

ccgccctttt tcagccatgt tcgcgctctg caaaaccttg ctcaggaatt agtggcccgc 2580 

ggtcatcgtg ttacgttttt tcagcaacat gactgcaaag cgctggtaac gggcagcgat 2640 

atcggattcc agaccgtcgg actgcaaacg catcctcccg gttccttatc gcacctgctg 2700 

cacctggccg cgcacccact cggaccctcg atgttacgac tgatcaatga aatggcacgt 2760 

accagcgata tgctttgccg ggaactgccc gccgcttttc atgcgttgca gatagagggc 2820 

gtgatcgttg atcaaatgga gccggcaggt gcagtagtcg cagaagcgtc aggtctgccg 2880 

tttgtttcgg tggcctgcgc gctgccgctc aaccgcgaac cgggtttgcc tctggcggtg 2940 

atgcctttcg agtacggcac cagcgatgcg gctcgggaac gctataccac cagcgaaaaa 3000 

atttatgact ggctgatgcg acgtcacgat cgtgtgatcg cgcatcatgc atgcagaatg 3060 

ggtttagccc cgcgtgaaaa actgcatcat tgtttttctc cactggcaca aatcagccag 3120 

ttgatccccg aactggattt tccccgcaaa gcgctgccag actgctttca tgcggttgga 3180 

ccgttacggc aaccccaggg gacgccgggg tcatcaactt cttattttcc gtccccggac 3240 

aaaccccgta tttttgcctc gctgggcacc ctgcagggac atcgttatgg cctgttcagg 3300 

accatcgcca aagcctgcga agaggtggat gcgcagttac tgttggcaca ctgtggcggc 3360 

ctctcagcca cgcaggcagg tgaactggcc cggggcgggg acattcaggt tgtggatttt 3420 

gccgatcaat ccgcagcact ttcacaggca cagttgacaa tcacacatgg tgggatgaat 3480 

acggtactgg acgctattgc ttcccgcaca ccgctactgg cgctgccgct ggcatttgat 3540 

caacctggcg tggcatcacg aattgtttat catggcatcg gcaagcgtgc gtctcggttt 3600 

actaccagcc atgcgctggc gcggcagatt cgatcgctgc tgactaacac cgattacccg 3660 

cagcgtatga caaaaattca ggccgcattg cgtctggcag gcggcacacc agccgccgcc 3720 

gatattgttg aacaggcgat gcggacctgt cagccagtac tcagtgggca ggattatgca 3780 

accgcactat gatctcattc tggtcggtgc cggtctggct aatggcctta tcgcgctccg 3840 

gcttcagcaa cagcatccgg atatgcggat cttgcttatt gaggcgggtc ctgaggcggg 3900 

agggaaccat acctggtcct ttcacgaaga ggatttaacg ctgaatcagc atcgctggat 3960 

agcgccgctt gtggtccatc actggcccga ctaccaggtt cgtttccccc aacgccgtcg 4020 

ccatgtgaac agtggctact actgcgtgac ctcccggcat ttcgccggga tactccggca 4080 

acagtttgga caacatttat ggctgcatac cgcggtttca gccgttcatg ctgaatcggt 4140 

ccagttagcg gatggccgga ttattcatgc cagtacagtg atcgacggac ggggttacac 4200 

gcctgattct gcactacgcg taggattcca ggcatttatc ggtcaggagt ggcaactgag 4260 

cgcgccgcat ggtttatcgt caccgattat catggatgcg acggtcgatc agcaaaatgg 4320 

ctaccgcttt gtttataccc tgccgctttc cgcaaccgca ctgctgatcg aagacacaca 4380 

ctacattgac aaggctaatc ttcaggccga acgggcgcgt cagaacattc gcgattatgc 4440 

tgcgcgacag ggttggccgt tacagacgtt gctgcgggaa gaacagggtg cattgcccat 4500 

tacgttaacg ggcgataatc gtcagttttg gcaacagcaa ccgcaagcct gtagcggatt 4560 

acgcgccggg ctgtttcatc cgacaaccgg ctactcccta ccgctcgcgg tggcgctggc 4620 

cgatcgtctc agcgcgctgg atgtgtttac ctcttcctct gttcaccaga cgattgctca 4680 

ctttgcccag caacgttggc agcaacaggg gtttttccgc atgctgaatc gcatgttgtt 4740 

tttagccgga ccggccgagt cacgctggcg tgtgatgcag cgtttctatg gcttacccga 4800 

ggatttgatt gcccgctttt atgcgggaaa actcaccgtg accgatcggc tacgcattct 4860 

gagcggcaag ccgcccgttc ccgttttcgc ggcattgcag gcaattatga cgactcatcg 4920 

ttgaagagcg actacatgaa accaactacg gtaattggtg cgggctttgg tggcctggca 4980 

ctggcaattc gtttacaggc cgcaggtatt cctgttttgc tgcttgagca gcgcgacaag 5040 

ccgggtggcc gggcttatgt ttatcaggag cagggcttta cttttgatgc aggccctacc 5100 

gttatcaccg atcccagcgc gattgaagaa ctgtttgctc tggccggtaa acagcttaag 5160 

gattacgtcg agctgttgcc ggtcacgccg ttttatcgcc tgtgctggga gtccggcaag 5220 

gtcttcaatt acgataacga ccaggcccag ttagaagcgc agatacagca gtttaatccg 5280 

5280 
cgcgatgttg cgggttatcg agcgttcctt gactattcgc gtgccgtatt caatgagggc 5340 

tatctgaagc tcggcactgt gcctttttta tcgttcaaag acatgcttcg ggccgcgccc 5400 

cagttggcaa agctgcaggc atggcgcagc gtttacagta aagttgccgg ctacattgag 5460 

gatgagcatc ttcggcaggc gttttctttt cactcgctct tagtgggggg gaatccgttt 5520 

gcaacctcgt ccatttatac gctgattcac gcgttagaac gggaatgggg cgtctggttt 5580 

ccacgcggtg gaaccggtgc gctggtcaat ggcatgatca agctgtttca ggatctgggc 5640 

ggcgaagtcg tgcttaacgc ccgggtcagt catatggaaa ccgttgggga caagattcag 5700 

gccgtgcagt tggaagacgg cagacggttt gaaacctgcg cggtggcgtc gaacgctgat 5760 

gttgtacata cctatcgcga tctgctgtct cagcatcccg cagccgctaa gcaggcgaaa 5820 

aaactgcaat ccaagcgtat gagtaactca ctgtttgtac tctattttgg tctcaaccat 5880 

catcacgatc aactcgccca tcataccgtc tgttttgggc cacgctaccg tgaactgatt 5940 

cacgaaattt ttaaccatga tggtctggct gaggattttt cgctttattt acacgcacct 6000 

tgtgtcacgg atccgtcact ggcaccggaa gggtgcggca gctattatgt gctggcgcct 6060 

gttccacact taggcacggc gaacctcgac tgggcggtag aaggaccccg actgcgcgat 6120 

cgtatttttg actaccttga gcaacattac atgcctggct tgcgaagcca gttggtgacg 6180 

caccgtatgt ttacgccgtt cgatttccgc gacgagctca atgcctggca aggttcggcc 6240 

ttctcggttg aacctattct gacccagagc gcctggttcc gaccacataa ccgcgataag 6300 

cacattgata atctttatct ggttggcgca ggcacccatc ctggcgcggg cattcccggc 6360 

gtaatcggct cggcgaaggc gacggcaggc ttaatgctgg aggacctgat ttgacgaata 6420 

cgtcattact gaatcatgcc gtcgaaacca tggcggttgg ctcgaaaagc tttgcgactg 6480 

catcgacgct tttcgacgcc aaaacccgtc gcagcgtgct gatgctttac gcatggtgcc 6540 

gccactgcga cgacgtcatt gacgatcaaa cactgggctt tcatgccgac cagccctctt 6600 

cgcagatgcc tgagcagcgc ctgcagcagc ttgaaatgaa aacgcgtcag gcctacgccg 6660 

gttcgcaaat gcacgagccc gcttttgccg cgtttcagga ggtcgcgatg gcgcatgata 6720 

tcgctcccgc ctacgcgttc gaccatctgg aaggttttgc catggatgtg cgcgaaacgc 6780 

gctacctgac actggacgat acgctgcgtt attgctatca cgtcgccggt gttgtgggcc 6840 

tgatgatggc gcaaattatg ggcgttcgcg ataacgccac gctcgatcgc gcctgcgatc 6900 

tcgggctggc tttccagttg accaacattg cgcgtgatat tgtcgacgat gctcaggtgg 6960 

gccgctgtta tctgcctgaa agctggctgg aagaggaagg actgacgaaa gcgaattatg 7020 

ctgcgccaga aaaccggcag gccttaagcc gtatcgccgg gcgactggta cgggaagcgg 7080 

aaccctatta cgtatcatca atggccggtc tggcacaatt acccttacgc tcggcctggg 7140 

ccatcgcgac agcgaagcag gtgtaccgta aaattggcgt gaaagttgaa caggccggta 7200 

agcaggcctg ggatcatcgc cagtccacgt ccaccgccga aaaattaacg cttttgctga 7260 

cggcatccgg tcaggcagtt acttcccgga tgaagacgta tccaccccgt cctgctcatc 7320 

tctggcagcg cccgatctag ccgcatgcct ttctctcagc gtcgcctgaa gtttagataa 7380 

cggtggcgcg tacagaaaac caaaggacac gcagccctct tttcccctta cagcatgatg 7440 

catacggtgg gccatgtata accgtttcag gtagcctttg cgcggtatgt agcggaacgg 7500 

ccagcgctgg tgtaccagtc cgtcgtggac cataaaatac agtaaaccat aagcggtcat 7560 

gcctgcacca atccactgga gcggccagat tcctgtactg ccgaagtaaa tcagggcaat 7620 

cgacacaatg gcgaatacca cggcatagag atcgttaact tcaaatgcgc ctttacgcgg 7680 

ttcatgatgt gaaagatgcc agccccaacc ccagccgtgc atgatgtatt tatgtgccag 7740 

tgcagcaacc acttccatgc cgaccacggt gacaaacacg atcagggcat tccaaatcca 7800 

caacataatt tctcaagggc gaattcgcgg ggatcctcta gagtcgacct gcaggcatgc 7860 

aagcttggca ctggccgtcg ttttacaacg tcgtgactgg gaaaaccctg gcgttaccca 7920 

acttaatcgc cttgcagcac atcccccttt cgccagctgg cgtaatagcg aagaggcccg 7980 

caccgatcgc ccttcccaac agttgcgcag cctgaatggc gaatggcgct gatgtccggc 8040 

ggtgcttttg ccgttacgca ccaccccgtc agtagctgaa caggagggac agctgataga 8100 

aacagaagcc actggagcac ctcaaaaaca ccatcataca ctaaatcagt aagttggcag 8160 

catcacccga cgcactttgc gccgaataaa tacctgtgac ggaagatcac ttcgcagaat 8220 

aaataaatcc tggtgtccct gttgataccg ggaagccctg ggccaacttt tggcgaaaat 8280 

gagacgttga tcggcacgta agaggttcca actttcacca taatgaaata agatcactac 8340 

cgggcgtatt ttttgagtta tcgagatttt caggagctaa ggaagctaaa atggagaaaa 8400 

aaatcactgg atataccacc gttgatatat cccaatggca tcgtaaagaa cattttgagg 8460 

catttcagtc agttgctcaa tgtacctata accagaccgt tcagctggat attacggcct 8520 

ttttaaagac cgtaaagaaa aataagcaca agttttatcc ggcctttatt cacattcttg 8580 

cccgcctgat gaatgctcat ccggaattt 8609 

<210> 65 
<211> 6329 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Plasmid pKD46 
<400> 65 

catcgattta ttatgacaac ttgacggcta catcattcac tttttcttca caaccggcac 60 

ggaactcgct cgggctggcc ccggtgcatt ttttaaatac ccgcgagaaa tagagttgat 120 

cgtcaaaacc aacattgcga ccgacggtgg cgataggcat ccgggtggtg ctcaaaagca 180 

gcttcgcctg gctgatacgt tggtcctcgc gccagcttaa gacgctaatc cctaactgct 240 

ggcggaaaag atgtgacaga cgcgacggcg acaagcaaac atgctgtgcg acgctggcga 300 

tatcaaaatt gctgtctgcc aggtgatcgc tgatgtactg acaagcctcg cgtacccgat 360 

tatccatcgg tggatggagc gactcgttaa tcgcttccat gcgccgcagt aacaattgct 420 

caagcagatt tatcgccagc agctccgaat agcgcccttc cccttgcccg gcgttaatga 480 
tttgcccaaa caggtcgctg aaatgcggct ggtgcgcttc atccgggcga aagaaccccg 540 

tattggcaaa tattgacggc cagttaagcc attcatgcca gtaggcgcgc ggacgaaagt 600 

aaacccactg gtgataccat tcgcgagcct ccggatgacg accgtagtga tgaatctctc 660 

ctggcgggaa cagcaaaata tcacccggtc ggcaaacaaa ttctcgtccc tgatttttca 720 

ccaccccctg accgcgaatg gtgagattga gaatataacc tttcattccc agcggtcggt 780 

cgataaaaaa atcgagataa ccgttggcct caatcggcgt taaacccgcc accagatggg 840 

cattaaacga gtatcccggc agcaggggat cattttgcgc ttcagccata cttttcatac 900 

tcccgccatt cagagaagaa accaattgtc catattgcat cagacattgc cgtcactgcg 960 

tcttttactg gctcttctcg ctaaccaaac cggtaacccc gcttattaaa agcattctgt 1020 

aacaaagcgg gaccaaagcc atgacaaaaa cgcgtaacaa aagtgtctat aatcacggca 1080 

gaaaagtcca cattgattat ttgcacggcg tcacactttg ctatgccata gcatttttat 1140 

ccataagatt agcggatcct acctgacgct ttttatcgca actctctact gtttctccat 1200 

acccgttttt ttgggaattc gagctctaag gaggttataa aaaatggata ttaatactga 1260 

aactgagatc aagcaaaagc attcactaac cccctttcct gttttcctaa tcagcccggc 1320 

atttcgcggg cgatattttc acagctattt caggagttca gccatgaacg cttattacat 1380 

tcaggatcgt cttgaggctc agagctgggc gcgtcactac cagcagctcg cccgtgaaga 1440 

gaaagaggca gaactggcag acgacatgga aaaaggcctg ccccagcacc tgtttgaatc 1500 

gctatgcatc gatcatttgc aacgccacgg ggccagcaaa aaatccatta cccgtgcgtt 1560 

tgatgacgat gttgagtttc aggagcgcat ggcagaacac atccggtaca tggttgaaac 1620 

cattgctcac caccaggttg atattgattc agaggtataa aacgaatgag tactgcactc 1680 

gcaacgctgg ctgggaagct ggctgaacgt gtcggcatgg attctgtcga cccacaggaa 1740 

ctgatcacca ctcttcgcca gacggcattt aaaggtgatg ccagcgatgc gcagttcatc 1800 

gcattactga tcgttgccaa ccagtacggc cttaatccgt ggacgaaaga aatttacgcc 1860 

tttcctgata agcagaatgg catcgttccg gtggtgggcg ttgatggctg gtcccgcatc 1920 

atcaatgaaa accagcagtt tgatggcatg gactttgagc aggacaatga atcctgtaca 1980 

tgccggattt accgcaagga ccgtaatcat ccgatctgcg ttaccgaatg gatggatgaa 2040 

tgccgccgcg aaccattcaa aactcgcgaa ggcagagaaa tcacggggcc gtggcagtcg 2100 

catcccaaac ggatgttacg tcataaagcc atgattcagt gtgcccgtct ggccttcgga 2160 

tttgctggta tctatgacaa ggatgaagcc gagcgcattg tcgaaaatac tgcatacact 2220 

gcagaacgtc agccggaacg cgacatcact ccggttaacg atgaaaccat gcaggagatt 2280 

aacactctgc tgatcgccct ggataaaaca tgggatgacg acttattgcc gctctgttcc 2340 

cagatatttc gccgcgacat tcgtgcatcg tcagaactga cacaggccga agcagtaaaa 2400 

gctcttggat tcctgaaaca gaaagccgca gagcagaagg tggcagcatg acaccggaca 2460 

ttatcctgca gcgtaccggg atcgatgtga gagctgtcga acagggggat gatgcgtggc 2520 

acaaattacg gctcggcgtc atcaccgctt cagaagttca caacgtgata gcaaaacccc 2580 
gctccggaaa gaagtggcct gacatgaaaa tgtcctactt ccacaccctg cttgctgagg 2640 
tttgcaccgg tgtggctccg gaagttaacg ctaaagcact ggcctgggga aaacagtacg 2700 
agaacgacgc cagaaccctg tttgaattca cttccggcgt gaatgttact gaatccccga 2760 
tcatctatcg cgacgaaagt atgcgtaccg cctgctctcc cgatggttta tgcagtgacg 2820 
gcaacggcct tgaactgaaa tgcccgttta cctcccggga tttcatgaag ttccggctcg 2880 
gtggtttcga ggccataaag tcagcttaca tggcccaggt gcagtacagc atgtgggtga 2940 
cgcgaaaaaa tgcctggtac tttgccaact atgacccgcg tatgaagcgt gaaggcctgc 3000 
attatgtcgt gattgagcgg gatgaaaagt acatggcgag ttttgacgag atcgtgccgg 3060 
agttcatcga aaaaatggac gaggcactgg ctgaaattgg ttttgtattt ggggagcaat 3120 
ggcgatgacg catcctcacg ataatatccg ggtaggcgca atcactttcg tctactccgt 3180 

tacaaagcga ggctgggtat ttcccggcct ttctgttatc cgaaatccac tgaaagcaca 3240 

gcggctggct gaggagataa ataataaacg aggggctgta tgcacaaagc atcttctgtt 3300 

gagttaagaa cgagtatcga gatggcacat agccttgctc aaattggaat caggtttgtg 3360 

ccaataccag tagaaacaga cgaagaatcc atgggtatgg acagttttcc ctttgatatg 3420 

taacggtgaa cagttgttct acttttgttt gttagtcttg atgcttcact gatagataca 3480 

agagccataa gaacctcaga tccttccgta tttagccagt atgttctcta gtgtggttcg 3540 

ttgtttttgc gtgagccatg agaacgaacc attgagatca tacttacttt gcatgtcact 3600 

caaaaatttt gcctcaaaac tggtgagctg aatttttgca gttaaagcat cgtgtagtgt 3660 

ttttcttagt ccgttacgta ggtaggaatc tgatgtaatg gttgttggta ttttgtcacc 3720 

attcattttt atctggttgt tctcaagttc ggttacgaga tccatttgtc tatctagttc 3780 

aacttggaaa atcaacgtat cagtcgggcg gcctcgctta tcaaccacca atttcatatt 3840 

gctgtaagtg tttaaatctt tacttattgg tttcaaaacc cattggttaa gccttttaaa 3900 

ctcatggtag ttattttcaa gcattaacat gaacttaaat tcatcaaggc taatctctat 3960 

atttgccttg tgagttttct tttgtgttag ttcttttaat aaccactcat aaatcctcat 4020 

agagtatttg ttttcaaaag acttaacatg ttccagatta tattttatga atttttttaa 4080 

ctggaaaaga taaggcaata tctcttcact aaaaactaat tctaattttt cgcttgagaa 4140 

cttggcatag tttgtccact ggaaaatctc aaagccttta accaaaggat tcctgatttc 4200 

cacagttctc gtcatcagct ctctggttgc tttagctaat acaccataag cattttccct 4260 

actgatgttc atcatctgag cgtattggtt ataagtgaac gataccgtcc gttctttcct 4320 

tgtagggttt tcaatcgtgg ggttgagtag tgccacacag cataaaatta gcttggtttc 4380 

atgctccgtt aagtcatagc gactaatcgc tagttcattt gctttgaaaa caactaattc 4440 

agacatacat ctcaattggt ctaggtgatt ttaatcacta taccaattga gatgggctag 4500 

tcaatgataa ttactagtcc ttttcctttg agttgtgggt atctgtaaat tctgctagac 4560 

ctttgctgga aaacttgtaa attctgctag accctctgta aattccgcta gacctttgtg 4620 
tgtttttttt gtttatattc aagtggttat aatttataga ataaagaaag aataaaaaaa 4680 

gataaaaaga atagatccca gccctgtgta taactcacta ctttagtcag ttccgcagta 4740 

ttacaaaagg atgtcgcaaa cgctgtttgc tcctctacaa aacagacctt aaaaccctaa 4800 

aggcttaagt agcaccctcg caagctcggt tgcggccgca atcgggcaaa tcgctgaata 4860 

ttccttttgt ctccgaccat caggcacctg agtcgctgtc tttttcgtga cattcagttc 4920 

gctgcgctca cggctctggc agtgaatggg ggtaaatggc actacaggcg ccttttatgg 4980 

attcatgcaa ggaaactacc cataatacaa gaaaagcccg tcacgggctt ctcagggcgt 5040 

tttatggcgg gtctgctatg tggtgctatc tgactttttg ctgttcagca gttcctgccc 5100 

tctgattttc cagtctgacc acttcggatt atcccgtgac aggtcattca gactggctaa 5160 

tgcacccagt aaggcagcgg tatcatcaac ggggtctgac gctcagtgga acgaaaactc 5220 

acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga tccttttaaa 5280 

ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt ctgacagtta 5340 

ccaatgctta atcagtgagg cacctatctc agcgatctgt ctatttcgtt catccatagt 5400 

tgcctgactc cccgtcgtgt agataactac gatacgggag ggcttaccat ctggccccag 5460 

tgctgcaatg ataccgcgag acccacgctc accggctcca gatttatcag caataaacca 5520 

gccagccgga agggccgagc gcagaagtgg tcctgcaact ttatccgcct ccatccagtc 5580 

tattaattgt tgccgggaag ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt 5640 

tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag 5700 

ctccggttcc caacgatcaa ggcgagttac atgatccccc atgttgtgca aaaaagcggt 5760 

tagctccttc ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt tatcactcat 5820 

ggttatggca gcactgcata attctcttac tgtcatgcca tccgtaagat gcttttctgt 5880 

gactggtgag tactcaacca agtcattctg agaatagtgt atgcggcgac cgagttgctc 5940 

ttgcccggcg tcaatacggg ataataccgc gccacatagc agaactttaa aagtgctcat 6000 

cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag 6060 

ttcgatgtaa cccactcgtg cacccaactg atcttcagca tcttttactt tcaccagcgt 6120 

ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg 6180 

gaaatgttga atactcatac tcttcctttt tcaatattat tgaagcattt atcagggtta 6240 

ttgtctcatg agcggataca tatttgaatg tatttagaaa aataaacaaa taggggttcc 6300 

gcgcacattt ccccgaaaag tgccacctg 6329 

<210> 66 
<211> 3423 
<212> DNA 

<213> Artificial sequence 
<220> 

<223> Plasmid pSUH5 
<400> 66 

agattgcagc attacacgtc ttgagcgatt gtgtaggctg gagctgcttc gaagttccta 60 
tactttctag agaataggaa cttcggaata ggaacttcaa gatcccctca cgctgccgca 120 

agcactcagg gcgcaagggc tgctaaagga agcggaacac gtagaaagcc agtccgcaga 180 

aacggtgctg accccggatg aatgtcagct actgggctat ctggacaagg gaaaacgcaa 240 

gcgcaaagag aaagcaggta gcttgcagtg ggcttacatg gcgatagcta gactgggcgg 300 

ttttatggac agcaagcgaa ccggaattgc cagctggggc gccctctggt aaggttggga 360 

agccctgcaa agtaaactgg atggctttct tgccgccaag gatctgatgg cgcaggggat 420 

caagatctga tcaagagaca ggatgaggat cgtttcgcat gattgaacaa gatggattgc 480 

acgcaggttc tccggccgct tgggtggaga ggctattcgg ctatgactgg gcacaacaga 540 

caatcggctg ctctgatgcc gccgtgttcc ggctgtcagc gcaggggcgc ccggttcttt 600 

ttgtcaagac cgacctgtcc ggtgccctga atgaactgca ggacgaggca gcgcggctat 660 

cgtggctggc cacgacgggc gttccttgcg cagctgtgct cgacgttgtc actgaagcgg 720 

gaagggactg gctgctattg ggcgaagtgc cggggcagga tctcctgtca tctcaccttg 780 

ctcctgccga gaaagtatcc atcatggctg atgcaatgcg gcggctgcat acgcttgatc 840 

cggctacctg cccattcgac caccaagcga aacatcgcat cgagcgagca cgtactcgga 900 

tggaagccgg tcttgtcgat caggatgatc tggacgaaga gcatcagggg ctcgcgccag 960 

ccgaactgtt cgccaggctc aaggcgcgca tgcccgacgg cgaggatctc gtcgtgaccc 1020 

atggcgatgc ctgcttgccg aatatcatgg tggaaaatgg ccgcttttct ggattcatcg 1080 

actgtggccg gctgggtgtg gcggaccgct atcaggacat agcgttggct acccgtgata 1140 

ttgctgaaga gcttggcggc gaatgggctg accgcttcct cgtgctttac ggtatcgccg 1200 

ctcccgattc gcagcgcatc gccttctatc gccttcttga cgagttcttc tgagcgggac 1260 

tctggggttc gaaatgaccg accaagcgac gcccaacctg ccatcacgag atttcgattc 1320 

caccgccgcc ttctatgaaa ggttgggctt cggaatcgtt ttccgggacg ccggctggat 1380 

gatcctccag cgcggggatc tcatgctgga gttcttcgcc caccccagct tcaaaagcgc 1440 

tctgaagttc ctatactttc tagagaatag gaacttcgga ataggaacta aggaggatat 1500 

tcactataaa aataggcgta tcacgaggcc ctttcgtctt cacctcgaga aatcataaaa 1560 

aatttatttg ctttgtgagc ggataacaat tataatagat tcaattgtga gcggataaca 1620 

atttcacaca gaattcatta aagaggagaa attaactcat atggaccatg gctaattccc 1680 

atgtcagccg ttaagtgttc ctgtgtcact gaaaattgct ttgagaggct ctaagggctt 1740 

ctcagtgcgt tacatccctg gcttgttgtc cacaaccgtt aaaccttaaa agctttaaaa 1800 

gccttatata ttcttttttt tcttataaaa cttaaaacct tagaggctat ttaagttgct 1860 

gatttatatt aattttattg ttcaaacatg agagcttagt acgtgaaaca tgagagctta 1920 

gtacgttagc catgagagct tagtacgtta gccatgaggg tttagttcgt taaacatgag 1980 

agcttagtac gttaaacatg agagcttagt acgtgaaaca tgagagctta gtacgtacta 2040 

tcaacaggtt gaactgcgga tcttgcggcc gcaaaaatta aaaatgaagt tttaaatcaa 2100 
tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc agtgaggcac 2160 

ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc gtcgtgtaga 2220 

taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata ccgcgagacc 2280 

cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg gccgagcgca 2340 

gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc cgggaagcta 2400 

gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct acaggcatcg 2460 

tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa cgatcaaggc 2520 

gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt cctccgatcg 2580 

ttgtcagaag taagttggcc gcagtgttat cactcatggt tatggcagca ctgcataatt 2640 

ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac tcaaccaagt 2700 

cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca atacgggata 2760 

ataccgcgcc acatagcaga actttaaaag tgctcatcat tggaaaacgt tcttcggggc 2820 

gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc actcgtgcac 2880 

ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca aaaacaggaa 2940 

ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata ctcatactct 3000 

tcctttttca atattattga agcatttatc agggttattg tctcatgagc ggatacatat 3060 

ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc cgaaaagtgc 3120 

cacctgcatc gatggccccc cgatggtagt gtggggtctc cccatgcgag agtagggaac 3180 

tgccaggcat caaataaaac gaaaggctca gtcgaaagac tgggcctttc gttttatctg 3240 

ttgtttgtcg gtgaacgctc tcctgagtag gacaaatccg ccgggagcgg atttgaacgt 3300 

tgcgaagcaa cggcccggag ggtggcgggc aggacgcccg ccataaactg ccaggcatca 3360 

aattaagcag aaggccatcc tgacggatgg cctttttgcg tggccagtgc caagcttgca 3420 

3423 